Data-Syntactic Regression of Ill-Being

Panel data transcends cross-sectional data by tapping pooled inter- and intra-individual differences, along with between- and within-individual variation separately. In the present study these micro variations in ill-being are predicted by psychological indicators constructed from the British Household Panel Survey (BHPS). Panel regression effects are corrected for errors-in-variables, which attenuate slopes estimated by traditional panel regressions. These corrections reveal that unhappiness and life dissatisfaction are distinct variables that have different psychological causations.


emphasized that
The history of the concept of happiness, stretching over two thousand years, is not a simple one. In early times 'happiness' meant simply success. Then for centuries, covering much of antiquity and the middle ages, it signified a man's perfect condition or the possession by him of the highest virtues and goods. Modern times reduced happiness to pleasure. Today, without discarding these earlier notions, we employ yet another concept of happiness. … For each man is either satisfied or dissatisfied with his life and gives expression to this feeling. But to be satisfied with life is not the same thing as to attain perfection or to enjoy a long succession of pleasures.
This modern conflation of happiness and satisfaction was picked up in the 1970's by the social indicators movement in three different ways. First, Andrews and Withey (1974) introduced the following life quality scale in the lead article of the maiden issue of Social Indicators Research:
Terrible | Unhappy | Mostly dissatisfied | Equally satisfied and dissatisfied | Mostly satisfied | Pleased | Delighted
This sequence of response labels presumes that (dis)satisfaction constitutes the core of a wellbeing continuum, and that (un)happiness constitutes the extremes of this single dimension.
Disorders..…In the same way that depression requires symptoms of an-hedonia, mental health consists of symptoms of hedonia such as emotional vitality and positive feelings towards one's life. In the same way that major depression consists of symptoms of mal-functioning, mental health consists of symptoms of positive functioning.
A rather extreme application of Keyes's observation has been proposed by Huppert and So (2013, p. 837): A conceptual framework is offered which equates high well-being with positive mental health. Wellbeing is seen as lying at the opposite end of a spectrum to the common mental disorders (depression, anxiety).
Factor analyzing items from the third round of the ESS, these authors find two factors, "feeling good" and "functioning well" (cf. Keyes, 2007), which they propose as separate dimensions of well-being or flourishing.
Rather than accepting this new conflation of ill-being and depression, the present paper retains unhappiness and dissatisfaction as the classic dimensions of ill-being. We then find these dimensions to be driven in very different ways by affective and cognitive depression, i.e. "feeling bad" and "functioning badly" (cf. Huppert and So, 2013). Thus, unlike these later authors, we regard depression as a cause rather than the essence of ill-being.

The study plan
Section 2 describes the two depression factors we extract from The United Kingdom's General Health Questionnaire, which is included in the British Household Panel Survey. Section 3 deconstructs our BHPS depression and ill-being scores into true scores and measurement errors. Section 4 describes an unbalanced census panel and the weighted sampling of panelists from this census. Sections 5 and 6 generalize randomization-based panel regressions (Bechtel, 2014) to true-value panel regressions of overall-, between-, and within-panelist data. Section 7 shows that all three data syntaxes sustain a dramatic role reversal for affective and cognitive depression in predicting ill-being. Section 8 points up the usefulness of true-value regression in a) removing measurement error from survey scores and b) sharpening the distinction between unhappiness and life dissatisfaction in social-indicators research. This final section also advocates true-value regression in the data syntaxes of cross-national surveys carried out at a single time point.

Dimensions of Depression: The General Health Questionnaire
Links between social indicators and health have been investigated by the United Kingdom's Health Development Agency. In one of the studies funded by this agency, Pevalin (2000, p. 508) notes: The General Health Questionnaire (GHQ) has been used as a screening instrument for minor psychiatric disturbance in numerous clinical studies as well as an indicator of psychiatric morbidity in large-scale community-based surveys.
… The aim of this study was to examine data from a large general population sample for evidence of any retest effects over 7 yearly applications. Methods: A core panel was drawn from the British Household Panel Survey of those respondents who had completed the GHQ-12 seven times from 1991 to 1997 (n = 4749). The panel results were compared with cross-sectional data from the Health Surveys for England for the same years. … Results: No evidence of retest effects was found. … Conclusion: The GHQ-12 is a consistent and reliable instrument when used in general population samples with relatively long intervals between applications.
In his review of the GHQ for Occupational Medicine, Jackson (2007, p. 79) indicates that this instrument … has been translated into 38 different languages, testament to the validity and reliability of the questionnaire. … Possibly, the most common assessment of mental well-being is the GHQ. Developed as a screening tool to detect those likely to have or be at risk of developing psychiatric disorders, it is a measure of the common mental health problems …
Table 2 exhibits the twelve items making up a short version of the General Health Questionnaire known as the GHQ-12. (The second item in Table 2, life dissatisfaction, is not part of this questionnaire.) The GHQ-12 is a dedicated, widely used psychiatric instrument whose interitem correlations range between .21 and .69. This correlation range is considerably higher than that (.10 to .49) for the ad hoc depressive antonyms used by Huppert and So (2013).
T. G. Bechtel (2007) found the GHQ-12 to be the strongest predictor of self-reported life satisfaction on the BHPS. The present study pursues this finding by factoring GHQ-12 items into two dimensions and using the resulting depression scales as predictors of unhappiness and life dissatisfaction separately. Our analysis is restricted to the eleven symptomatic items in the GHQ-12 because the item on overall unhappiness is our dependent variable. These eleven symptoms were submitted to the oblique factor analysis exhibited in Table 1. The items marking the two oblique dimensions are described in Tables 1 and 2 under the labels of affective and cognitive depression. These terms, which connote distressing experience and the inability to carry out normal functions, are antonyms of Huppert and So's (2013) "feeling good" and "functioning well". However, due to the higher correlations among GHQ-12 items than among Huppert and So's ESS items, the factor loadings in Table 1 exhibit a sharper distinction between these two dimensions.
Note to Table 1. These rotated factor loadings were obtained from an oblimax rotation of principal components.

Observed Scores, True Scores, and Measurement Error
By including the GHQ-12 and various satisfaction items, the BHPS (http://www.esds.ac.uk/longitudinal) allows us to assess the effects of affective and cognitive depression on unhappiness and life dissatisfaction. Table 2 exhibits the 13 BHPS items that measure these four constructs.

Equally-spaced response codes
Unhappiness. The first item in Table 2 is actually the last item on the GHQ-12. We code its responses in three equal steps between zero and ten. Let $U_{it}$ be our coding of the response label chosen by individual i on wave t. We deconstruct this coding as
$U_{it} = \upsilon_{it} + K_{it}$, (3.1)
where $\upsilon_{it}$ is true unhappiness and $K_{it}$ is a coding error in measuring this characteristic. The values $\upsilon_{it}$ and $K_{it}$ lie on a continuous interval scale whose origin and unit are set by coding the extreme response labels, more so than usual and much less than usual, as zero and ten.
Measurement error is the departure of respondent i's selected coding, 0, 3.33, 6.67, or 10, from her (his) true value $\upsilon_{it}$.
Life dissatisfaction. The second item in Table 2 is coded in six equal steps. Again the origin and unit of this scale are set by coding completely satisfied as zero and not satisfied at all as ten. Let $D_{it}$ denote our coding of the response label that i selects on wave t. Then
$D_{it} = \delta_{it} + F_{it}$, (3.2)
where $\delta_{it}$ is i's true dissatisfaction on the continuous scale [0, 10]. Measurement error $F_{it}$ is the departure of the selected coding, 0, 1.67, 3.33, 5, 6.67, 8.33, or 10, from $\delta_{it}$.
The common interval scale. The interval scale shared by all items in Table 2 allows a comparison of the unhappiness and dissatisfaction regression slopes computed in the present paper. This scale tolerates a) the different numbers of response options for the first two items in Table 2, and b) the equal spacing of response codes, which depart from true values. These departures Kit and Fit are assumed to satisfy the classical error properties given in the last paragraph of Section 5.1.
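The equally spaced coding described above can be illustrated with a minimal sketch. The function name `equally_spaced_codes` and its parameters are hypothetical and are not taken from the BHPS documentation; the sketch only assumes that the extreme response labels are coded as zero and ten.

```python
def equally_spaced_codes(n_labels, low=0.0, high=10.0):
    """Return n_labels equally spaced codes running from low to high."""
    if n_labels < 2:
        raise ValueError("need at least two response labels")
    step = (high - low) / (n_labels - 1)
    return [low + k * step for k in range(n_labels)]


# Four response labels give three equal steps (unhappiness item).
unhappiness_codes = equally_spaced_codes(4)
# Seven response labels give six equal steps (life dissatisfaction item).
dissatisfaction_codes = equally_spaced_codes(7)

print([round(c, 2) for c in unhappiness_codes])
print([round(c, 2) for c in dissatisfaction_codes])
```

Both items share the same origin and unit, so slopes computed from them are comparable even though the number of response options differs.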

Multiple-item scores
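Affective depression. An individual's affective depression score, which also contains measurement error, is an average of six item responses. Still referring to Table 2, we deconstruct i's affective-depression item coding on wave t as
$A_{itm} = \alpha_{it} + G_{itm}$ for m = 1, …, 6.
On wave t each item m measures i's true affective depression $\alpha_{it}$ with error $G_{itm}$ on our common scale [0, 10], i.e. $G_{itm}$ is the departure of coding $A_{itm}$ (= 0, 3.33, 6.67, or 10) from $\alpha_{it}$. Her (his) affective depression score on wave t is
$A_{it} = (A_{it1} + A_{it2} + A_{it3} + A_{it4} + A_{it5} + A_{it6})/6 = \alpha_{it} + (G_{it1} + G_{it2} + G_{it3} + G_{it4} + G_{it5} + G_{it6})/6 = \alpha_{it} + G_{it}$. (3.3)
The error score $G_{it}$ in (3.3) is i's error score on wave t, which is the deviation of her (his) score $A_{it}$ from $\alpha_{it}$.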
Cognitive depression. Finally, we derive cognitive depression scores as averages of five item response codes. A cognitive-depression item coding in Table 2 is decomposed as
$C_{itm} = \gamma_{it} + H_{itm}$ for m = 1, …, 5,
where item m measures i's true cognitive depression $\gamma_{it}$ on wave t with error $H_{itm}$. Her (his) cognitive depression score on wave t is
$C_{it} = (C_{it1} + C_{it2} + C_{it3} + C_{it4} + C_{it5})/5 = \gamma_{it} + (H_{it1} + H_{it2} + H_{it3} + H_{it4} + H_{it5})/5 = \gamma_{it} + H_{it}$. (3.4)
The error score $H_{it}$ in (3.4) is the deviation of observed score $C_{it}$ from $\gamma_{it}$ on our common interval scale [0, 10].
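The decomposition in (3.4) can be checked numerically with a small sketch. The respondent, the true value, and the five item codings below are invented for illustration only; in practice the true value is unobservable, and `construct_score` is a hypothetical helper name.

```python
def construct_score(values):
    """Average a list of item-level values into one construct-level value."""
    return sum(values) / len(values)


# Hypothetical respondent-wave: true cognitive depression on the [0, 10] scale.
true_value = 4.5
# Five hypothetical item codings taken from the four-point codes 0, 3.33, 6.67, 10.
item_codes = [10 / 3, 20 / 3, 10 / 3, 10 / 3, 20 / 3]

# Item errors are the departures of the codings from the true value.
item_errors = [code - true_value for code in item_codes]

score = construct_score(item_codes)          # observed construct score
error_score = construct_score(item_errors)   # mean of the five item errors

# The observed score equals the true value plus the error score.
print(round(score, 3), round(true_value + error_score, 3))
```

The same arithmetic applies to the six-item affective depression score.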

Census and Sample Panels
Define a vector $Y_{it} = (U_{it}, D_{it}, A_{it}, C_{it})^{T}$ whose elements are the scores in (3.1) to (3.4). The term panelist denotes an intra-individual sequence of vectors $Y_{it}$ illustrated by a single row in Table 3. The rows of Table 3 make up an unbalanced census panel in which panelists have different numbers of wave appearances due to attrition and/or late panel entry. More generally, in an unbalanced census panel each panelist i appears in t = 1, …, $T_i$ waves for i = 1, …, N. Subscript t = 1 denotes individual i's first appearance even though her (his) panel entry may occur later than wave 1.
The italic boldface rows in Table 3 illustrate an unbalanced sample panel of n = 3 panelists drawn from an unbalanced census panel of N = 7 panelists. Each longitudinal weight in the last column of Table 3 is calculated from the probability of that sampled individual being monitored over her (his) particular sequence of waves within a time span of four waves. For example, the weight w2 reflects the probability of panelist 2 being drawn in the sample at wave 1 and giving a full interview on waves 1, 2, 3 and 4. The construction of longitudinal weights for sampled panelists is given in detail by the BHPS (http://www.esds.ac.uk/longitudinal). These longitudinal weights are used in all six true-value regressions reported in Section 7.3. Table 3 illustrates the data used in three kinds of panel regressions developed in Sections 5 and 6. Each data-syntactic regression is posed for our hypothetical census panel and run for our realized sample panel. Thus, in Table 3 an un-weighted census regression is posed over 22 individual-wave observations. This census regression generates target parameters, which are estimated by a weighted sample regression run over 9 individual-wave observations.
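The three data syntaxes (overall scores it, panelist means i., and within-panelist deviations it − i.) can be sketched for a toy unbalanced panel. The panel values and function names are hypothetical. The sketch assumes that, in the between syntax, each panelist's mean is repeated once per wave appearance, which is consistent with the Appendix statement that a score has the same overall mean in the it and i. syntaxes. Longitudinal weights are omitted for brevity.

```python
# Hypothetical unbalanced panel: panelist id -> scores on the waves in which
# that panelist appears (different panelists have different numbers of waves).
panel = {1: [2.0, 4.0, 6.0], 2: [5.0], 3: [1.0, 3.0]}


def mean(values):
    return sum(values) / len(values)


def overall_syntax(panel):
    """Syntax it: every individual-wave score."""
    return [x for scores in panel.values() for x in scores]


def between_syntax(panel):
    """Syntax i.: each panelist's mean, repeated for each wave appearance."""
    out = []
    for scores in panel.values():
        out.extend([mean(scores)] * len(scores))
    return out


def within_syntax(panel):
    """Syntax it - i.: deviations of scores from the panelist's own mean."""
    out = []
    for scores in panel.values():
        m = mean(scores)
        out.extend([x - m for x in scores])
    return out


print(overall_syntax(panel))
print(between_syntax(panel))
print(within_syntax(panel))
```

Each syntax yields one value per individual-wave observation, so the same regression routine can be run on any of the three.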

True parameter identification
Following our illustration in Section 4, we posit sets of census scores, true scores, and error scores: $\{U_{it}, D_{it}, A_{it}, C_{it}\}$, $\{\upsilon_{it}, \delta_{it}, \alpha_{it}, \gamma_{it}\}$, and $\{K_{it}, F_{it}, G_{it}, H_{it}\}$, where t = 1, …, $T_i$ for i = 1, …, N. The first set is hypothetically computed from a posited census of the 13 items described in Section 3. The second set of true scores is in one-to-one correspondence with the set of census scores. The third set is the difference set of error scores.
True unhappiness and dissatisfaction regressions may then be written as
$\upsilon_{it} = \kappa + \phi\,\alpha_{it} + \theta\,\gamma_{it} + e_{it}$ (5.1a)
and …
Here true unhappiness $\upsilon_{it}$ is regressed on true affective depression $\alpha_{it}$ and true cognitive depression $\gamma_{it}$, with intercept $\kappa$, slopes $\phi$ and $\theta$, and error $e_{it}$.
The two census parameters in the above error sums of squares are internal-consistency reliabilities used in psychological testing (Lord and Novick, 1968; Nunnally and Bernstein, 1994). Alternative formulas for internal-consistency reliability, also known as coefficient alpha (Cronbach, 1952) …
An identical procedure is carried out for obtaining the standard errors of the intercept and slopes in (5.3b). Table 6 reports the standard errors given by (5.6) for data syntax it in the column labeled "Between-and-within panelists".
Under (6.1), it is easy to show that mean errors are uncorrelated with each other and with mean true scores. Then (5.2a) and (5.2b), with subscript it overwritten by i., identify the true slopes of $\upsilon_{i.}$ on $\alpha_{i.}$ and $\gamma_{i.}$, and of $\delta_{i.}$ on $\alpha_{i.}$ and $\gamma_{i.}$. Equations (5.3a) and (5.3b), again with i. overwriting it, estimate these true slopes by adjusting for measurement errors in mean scores $A_{i.}$ and $C_{i.}$. Slope standard errors, estimated from (5.6) in syntax i., are reported in the "Between panelists" column of Table 6.
Slope standard errors in syntax it − i. could not be estimated from (5.6) due to convergence failure. Therefore, we report underestimated standard errors from (5.5) in the "Within panelists" column in Table 6. Our resort to (5.5) assumes that the correction matrix in (5.3a) and (5.3b) is fixed rather than random.
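The attenuation that true-value regression is meant to remove can be illustrated with a simulation. This is only a sketch under stated assumptions and not the paper's estimator. It uses a textbook method-of-moments correction that subtracts estimated error sums of squares, (1 − reliability) times a predictor's centered sum of squares, from the diagonal of the cross-product matrix. It is unweighted, treats the reliabilities as known, and all data, coefficients, and function names are invented.

```python
import random


def solve3(m, v):
    """Solve a 3x3 linear system m b = v by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def centered_ss(x):
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x)


def regress(y, A, C, rel_A=1.0, rel_C=1.0):
    """Least squares of y on (1, A, C); reliabilities below 1 apply the correction."""
    X = [(1.0, a, c) for a, c in zip(A, C)]
    xtx = [[sum(r[j] * r[k] for r in X) for k in range(3)] for j in range(3)]
    xty = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(3)]
    # Assumed correction: remove estimated error sums of squares from the diagonal.
    xtx[1][1] -= (1.0 - rel_A) * centered_ss(A)
    xtx[2][2] -= (1.0 - rel_C) * centered_ss(C)
    return solve3(xtx, xty)


random.seed(0)
n = 20000
# Simulated true scores (unobservable in a real survey).
true_a = [random.gauss(5.0, 2.0) for _ in range(n)]
true_c = [0.5 * a + random.gauss(2.5, 1.5) for a in true_a]
y = [1.0 + 0.3 * a + 0.7 * c + random.gauss(0.0, 1.0) for a, c in zip(true_a, true_c)]
# Observed scores are true scores plus independent measurement errors.
A = [a + random.gauss(0.0, 1.0) for a in true_a]
C = [c + random.gauss(0.0, 1.5) for c in true_c]

# Reliability = true-score variance / observed-score variance (known by design here).
rel_A = 4.0 / (4.0 + 1.0)
rel_C = 3.25 / (3.25 + 2.25)

naive = regress(y, A, C)
corrected = regress(y, A, C, rel_A, rel_C)
print("naive    :", [round(b, 3) for b in naive])
print("corrected:", [round(b, 3) for b in corrected])
```

In this simulation the naive slopes on the two predictors are close to each other, while the corrected slopes recover the much larger simulated effect of the less reliably measured predictor.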

Correlations between unhappiness and life dissatisfaction
The distinction between unhappiness and life dissatisfaction is demonstrated in Table 4 by the low correlations between scores $U_{it}$ and $D_{it}$, individual means $U_{i.}$ and $D_{i.}$, and within-individual deviations $U_{it} - U_{i.}$ and $D_{it} - D_{i.}$. The between-panelist correlation exceeds the overall correlation which, in turn, surpasses the within-panelist correlation. This rank order is expected because inter-individual differences are greater than intra-individual differences.
These low correlations in all three data syntaxes reject the conflation between unhappiness and dissatisfaction perpetuated by the social indicators movement (cf. Section 1).

Reliabilities of Depression Scores
Table 5 exhibits reliabilities of our two depression scores computed from formulas in the Appendix. Each of these alpha coefficients is a Horvitz-Thompson-type estimate of internal-consistency reliability. The first row of Table 5 gives the reliability estimate in (5.4a) for the i., it, and it − i. data syntaxes. The second row displays the estimate in (5.4b) for the same syntaxes. These alpha coefficients are based on the depression items in Table 2.
Two main effects stand out in Table 5. First, affective depression is more reliably measured than cognitive depression. Second, between-panelist means are more reliable than overall panel scores which, in turn, are more reliable than within-panelist deviations. This order of reliabilities is expected from the fact that inter-individual variation exceeds intra-individual variation. The standard errors of these alpha coefficients show that the two main effects in Table 5 are highly significant.

Effects of depression on ill-being
The reliability coefficients in Table 5 correct the regression slopes in Table 6 for attenuation due to measurement errors in our depression scores. The slopes in Table 6 reveal a striking reversal for affective and cognitive depression underlying unhappiness and life dissatisfaction. Cognitive depression is the primary driver of unhappiness, whereas affective depression is the main cause of life dissatisfaction. This distinction is consistent with the low correlations between unhappiness and dissatisfaction in Table 4.
Unhappiness. The slopes in the first two rows of Table 6 were computed from Formula (5.3a) in the i., it, and it − i. data syntaxes. These slopes show that cognitive depression dominates affective depression in driving unhappiness in all three syntaxes.
Life dissatisfaction. Conversely, the slopes in the last two rows of Table 6 demonstrate the dominance of affective depression in generating life dissatisfaction in each data syntax. These slopes were computed from Formula (5.3b).
Cross validation. The full-sample results in Tables 1 and 6 were cross-validated on a holdout sample (http://en.wikipedia.org/wiki/Cross-validation_(statistics)). First, the BHPS sample was randomly split into a training dataset of 34580 cases and a testing dataset of 34846 cases. Next, the factor analysis in Section 2, and the regressions in Sections 5 and 6, were run on the training sample. These analyses again produced the affective and cognitive depression factors in Table 1 and the syntactic regression pattern in Table 6. Finally, the training-sample regression coefficients were applied to the holdout sample's depression scores. This generated a holdout mean squared error (MSE) for each of the six syntactic regressions. The ratios of the holdout MSEs to the training-sample MSEs ranged from .992 to 1.004, showing very slight loss in predictive power for true-value regression coefficients.
Notes to Table 6: Each of these six analyses is a weighted true-value regression in a panel data syntax. The standard errors (in parentheses) for syntaxes i. and it were iteratively obtained from (5.6). Iterative variance estimation failed to converge in syntax it − i., and starting values in (5.5) were used as variance estimates. This assumes that the diagonal correction matrix in (5.3a) and (5.3b), whose first diagonal element is zero, when overwritten for deviation data, is fixed rather than random (cf. Section 5.2 and Bechtel, 2010, Appendix). Thus, the standard errors in the last column are understated for the two within-panelists regressions. The hypothesis of slope equality was rejected with p < .0001 in each i. and it regression. This hypothesis was also rejected with p < .0001 in each it − i. regression using doubled standard errors. The number of observations in each unhappiness regression is 68498. The number of observations in each dissatisfaction regression is 62204 because the BHPS did not include the life satisfaction item in 2001.
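The holdout check described under "Cross validation" can be sketched as follows. The data are simulated, a single-predictor ordinary least-squares line stands in for the six true-value regressions, and the function names are hypothetical. Only the logic of the MSE ratio is illustrated.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx, slope


def mse(x, y, intercept, slope):
    """Mean squared prediction error of a fitted line on the given cases."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


random.seed(1)
n = 20000
x = [random.gauss(5.0, 2.0) for _ in range(n)]
y = [1.0 + 0.5 * xi + random.gauss(0.0, 1.0) for xi in x]

# Random split into training and holdout halves.
cases = list(range(n))
random.shuffle(cases)
train, holdout = cases[: n // 2], cases[n // 2:]

x_tr, y_tr = [x[i] for i in train], [y[i] for i in train]
x_ho, y_ho = [x[i] for i in holdout], [y[i] for i in holdout]

# Coefficients are estimated on the training half only and applied to the holdout.
intercept, slope = fit_line(x_tr, y_tr)
ratio = mse(x_ho, y_ho, intercept, slope) / mse(x_tr, y_tr, intercept, slope)
print(round(ratio, 3))
```

A ratio near one indicates that coefficients estimated on the training half predict the holdout half about as well as they predict their own data.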

True-Value Regression in Micro Data Syntaxes
True-value regression accurately estimates relationships between population variables at different levels of important micro data.
Panel data. In the present paper micro variations in ill-being are predicted by two depression indicators constructed from the British Household Panel Survey. Panel regression effects are corrected for errors in these predictors, which attenuate slopes estimated by traditional panel regressions. These corrections, carried out on a large high-quality dataset, reveal that unhappiness and life dissatisfaction are distinct variables with very different psychological causations.
True effects of affective and cognitive depression are measured in three data syntaxes: between individuals, within individuals, and between and within individuals overall. Table 6 exhibits a striking role reversal between cognitive and affective depression in each syntax. Cognitive depression drives unhappiness, whereas affective depression drives life dissatisfaction.
These distinctive psychological processes explain the low correlations between unhappiness and life dissatisfaction reported in Table 4, also in each data syntax. Clearly, future efforts to measure ill-being should untangle the conflation of unhappiness and life dissatisfaction that has pervaded social-indicators research since the 1970's.
Cross-national data. In addition to panel data from a single nation, cross-national datasets may be syntactically regressed. Letting c denote a country and i an individual, our three syntaxes become: between countries, within countries, and between and within countries overall. Future cross-national research, with high-quality datasets such as the European Social Survey (Fitzgerald, 2013), would benefit from true-value regressions run in each of these three data syntaxes. The added values delivered by such analyses are a) more accurate estimation of regression slopes, b) multi-level assessment of these slopes, and c) confirmation of micro relationships over data syntaxes. These advantages exist for each survey and, in the case of repeated surveys like the European Social Survey, syntactic changes in micro relationships can be monitored over time.

Appendix: Coefficient Alpha
In Section 2 and Table 2 let $S_{itm}$ be a [0, 10] item score and $S_{it}$ be a [0, 10] construct score. Thus, an M-item construct score $S_{it}$ is the average $M^{-1}\sum_{m} S_{itm}$ of its item scores, where M = 6 for affective depression, and M = 5 for cognitive depression.
This Appendix gives a construct's census alpha, its sample estimate, and the sampling variance of this estimate. The estimated alphas for affective and cognitive depression, along with their standard errors, appear in the between-and-within column of Table 5 for syntax it.

A.1 Census Definition of Alpha in Data Syntax it
Coefficient alpha is the standard measure of internal-consistency reliability of a psychological test. Bechtel (2013) shows that this coefficient is the ratio of a construct's true-score variance to its observed-score variance. However, in defining alpha it is not necessary to have a construct's true scores. The power of this coefficient lies in its definition of reliability solely in terms of observable item scores Sitm and their average (construct) score Sit.
We write the census coefficient alpha as …
The square root of the variance of the estimated alpha is the standard error for each alpha coefficient in the between-and-within column of Table 5.
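For orientation, the textbook form of coefficient alpha can be computed from item scores alone, as the sketch below shows. It is an unweighted illustration with invented data and a hypothetical function name. It is not the Horvitz-Thompson-type estimator or the variance formula of this Appendix, which are not reproduced here.

```python
def variance(values):
    """Census-style variance (divisor N)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def coefficient_alpha(item_rows):
    """Textbook coefficient alpha.

    item_rows holds one row per individual-wave observation,
    each row containing that observation's M item scores.
    """
    n_items = len(item_rows[0])
    item_vars = [variance([row[m] for row in item_rows]) for m in range(n_items)]
    total_var = variance([sum(row) for row in item_rows])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical data: two items that always agree, and two unrelated items.
consistent = [[0.0, 0.0], [10.0, 10.0], [10 / 3, 10 / 3]]
unrelated = [[0.0, 0.0], [0.0, 10.0], [10.0, 0.0], [10.0, 10.0]]

print(coefficient_alpha(consistent))
print(coefficient_alpha(unrelated))
```

Because the ratio of variances is unchanged by rescaling, it makes no difference whether the construct score is taken as the sum or the average of its item scores.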

A.3 The Reliability of i. and it − i. Panel Scores
The alpha coefficients and standard errors in the between column of Table 5 are obtained by overwriting the subscript it by i. in this Appendix. The estimates in the within column of Table 5 are given by overwriting it by it − i. Note that a construct or item score has the same overall mean in the it and i. data syntaxes. In syntax it − i. this overall mean is zero.