Abstract: Background: Brass developed a procedure for converting the proportions dead among children ever born, as reported by women of childbearing age, into estimates of the probability of dying before attaining certain exact childhood ages. The method has become very popular in less developed countries where direct mortality estimation is not possible due to incomplete death registration. However, the estimates of q(x), the probability of dying before age x, obtained by Trussell’s variant of the Brass method are sometimes unrealistic, with q(x) not increasing monotonically in x. Method: State-level child mortality estimates obtained by Trussell’s variant of the Brass method from the 1991 and 2001 Indian census data were made monotonically increasing by logit smoothing. Using two of the smoothed child mortality estimates, an infant mortality estimate is obtained by fitting a two-parameter Weibull survival function. Results: It was found that in many states and union territories infant mortality rates increased between 1991 and 2001. Cross-checking against the 1991 and 2001 census data on the increase or decrease in the percentage of children dead establishes the reliability of the estimates. Conclusion: We have reason to question the trend of declining infant mortality reported by various agencies and researchers.
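As a sketch of the fitting step described above: given two smoothed estimates q(x1) and q(x2), the two-parameter Weibull survivorship S(x) = exp(-(x/λ)^k) can be solved in closed form (the relation log(-log(1 - q(x))) = k·log x - k·log λ is linear in log x), and q(1) read off as the infant mortality estimate. The numeric values in the usage line are hypothetical, not census figures.

```python
import math

def weibull_infant_mortality(x1, q1, x2, q2):
    """Fit S(x) = exp(-(x/lam)**k) through two child-mortality points,
    where q(x) = 1 - S(x), and return the implied infant mortality q(1)."""
    # Transform: log(-log(S(x))) = k*log(x) - k*log(lam), linear in log(x)
    y1 = math.log(-math.log(1.0 - q1))
    y2 = math.log(-math.log(1.0 - q2))
    k = (y2 - y1) / (math.log(x2) - math.log(x1))  # shape parameter
    lam = x1 / math.exp(y1 / k)                    # scale parameter
    return 1.0 - math.exp(-(1.0 / lam) ** k)

# Hypothetical smoothed estimates q(2) = 0.08 and q(5) = 0.10
imr = weibull_infant_mortality(2, 0.08, 5, 0.10)
```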
Abstract: A multilevel model (allowing for individual risk factors and geographic context) is developed for jointly modelling cross-sectional differences in diabetes prevalence and trends in prevalence, and then adapted to provide geographically disaggregated diabetes prevalence forecasts. This involves a weighted binomial regression applied to US data from the Behavioral Risk Factor Surveillance System (BRFSS) survey, specifically totals of diagnosed diabetes cases, and populations at risk. Both cases and populations are disaggregated according to survey year (2000 to 2010), individual risk factors (e.g., age, education), and contextual risk factors, namely US census division and the poverty level of the county of residence. The model includes a linear growth path in decadal time units, and forecasts are obtained by extending the growth path to future years. The trend component of the model controls for interacting influences (individual and contextual) on changing prevalence. Prevalence growth is found to be highest among younger adults, among males, and among those with high school education. There are also regional shifts, with a widening of the US “diabetes belt”.
Abstract: Correlation coefficients are generally viewed as summaries, causing them to be underutilized. Creating functions from them leads to their use in diverse areas of statistics. Because there are many correlation coefficients (see, for example, Gideon (2007)), this extension makes possible a very broad range of statistical estimators that rivals least squares. The whole area could be called a “Correlation Estimation System.” This paper outlines some of the numerous possibilities for using the system and gives some illustrative examples; detailed explanations are developed in earlier papers. The formulae needed for the estimation are given, together with some of the computer code to implement it. This approach has been taken in the hope that this condensed version of the work will make the ideas accessible, show their practicality, and promote further developments.
Abstract: The Center for Neural Interface Design of the Biodesign Institute at Arizona State University conducted an experiment to investigate how the central nervous system controls hand orientation and movement direction during reach-to-grasp movements. ANOVA (analysis of variance), a conventional data analysis method widely used in neuroscience, was performed to categorize different neural activities. Preliminary studies of data analysis methods have shown that the principal assumption of ANOVA is violated and that some characteristics of the data are lost when ratios of the recorded data are taken. To compensate for these deficiencies of ANOVA, ANCOVA (analysis of covariance) is introduced in this paper. By considering neural firing counts and temporal intervals separately, we expect to extract more useful information for determining the correlations of different types of neurons with motor behavior. Compared to ANOVA, ANCOVA goes one step further in identifying which direction or orientation is favored during which epoch. We find that a considerable number of neurons are involved in movement direction, hand orientation, or both combined, and that some are significant in more than one epoch, indicating a network with unknown pathways connecting neurons in the motor cortex throughout the entire movement. For future studies we suggest integrating this work into neural network modeling in order to simulate the whole reach-to-grasp process.
Abstract: Nowadays, extensive amounts of data are stored, requiring the development of specialized methods that make data analysis understandable. In medical data analysis, many potential factors are usually introduced to determine an outcome response variable. The main objectives of variable selection are to enhance the prediction performance of the predictor variables and to identify, correctly and parsimoniously, the faster and more cost-effective predictors that have an important influence on the response. Various variable selection techniques are used to improve predictability and to obtain the “best” model derived from a screening procedure. In our study, we propose a variable subset selection method which extends the idea of selecting variables to the classification case and combines a nonparametric criterion with a likelihood-based criterion. In this work, the Area Under the ROC Curve (AUC) criterion is used from another viewpoint in order to determine the important factors more directly. The proposed method leads to a modification (BIC) of the modified Bayesian Information Criterion (mBIC). The introduced BIC is compared with existing variable selection methods in simulation experiments, and Type I and Type II error rates are calculated. Additionally, the proposed method is applied successfully to a high-dimensional Trauma data analysis, and its good predictive properties are confirmed.
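A minimal illustration of the AUC idea used above (univariate AUC screening only, not the paper's full mBIC-based procedure): each candidate predictor is scored by its AUC against the binary outcome, computed via the rank-sum identity, and predictors above a chosen threshold are kept. The threshold of 0.6 is an arbitrary illustrative choice.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney (rank-sum) identity:
    the probability that a positive case outscores a negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def screen_by_auc(variables, labels, threshold=0.6):
    """Keep predictors whose direction-free AUC exceeds a threshold,
    ranked from most to least discriminating."""
    scored = {}
    for name, col in variables.items():
        a = auc(col, labels)
        scored[name] = max(a, 1 - a)  # direction-free discrimination
    keep = [n for n, a in scored.items() if a >= threshold]
    return sorted(keep, key=lambda n: -scored[n])
```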
Abstract: A randomly truncated sample arises when the independent variables T and L are observable only if L < T. The truncated-version Kaplan-Meier estimator is the standard method for estimating the marginal distribution of T or L. The inverse probability weighted (IPW) estimator was suggested as an alternative, and its agreement with the truncated-version Kaplan-Meier estimator has been proved. This paper centers on the weak convergence of IPW estimators and on variance decomposition. We show that the asymptotic variance of an IPW estimator can be decomposed into two sources: the variation of the IPW estimator with known weight functions is the primary source, but the variation due to estimated weights must be included as well. The variance decomposition establishes the connection between a truncated sample and a biased sample with known probabilities of selection. A simulation study was conducted to investigate the practical performance of the proposed variance estimators, as well as the relative magnitudes of the two sources of variation for various truncation rates. A blood transfusion data set is analyzed to illustrate the nonparametric inference discussed in the paper.
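The truncated-version Kaplan-Meier estimator referred to above is the product-limit estimator for left-truncated data (often attributed to Lynden-Bell): at each observed event time the risk set contains only subjects already under observation. A plain-Python sketch, assuming no censoring beyond the truncation itself:

```python
def truncated_product_limit(L, T):
    """Product-limit estimate of P(T > t) from left-truncated pairs
    (L_i, T_i), observable only when L_i < T_i."""
    times = sorted(set(T))
    surv, s = {}, 1.0
    for t in times:
        d = sum(1 for ti in T if ti == t)                    # events at t
        n = sum(1 for li, ti in zip(L, T) if li <= t <= ti)  # risk set size
        s *= 1.0 - d / n
        surv[t] = s
    return surv
```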
Abstract: Loss of household income and purchasing power is shown to have broad, negative societal effects. The economic anxiety accompanying this loss has its strongest impact on consumer demand, the major factor in a nation’s gross domestic product (GDP). Negative effects of economic anxiety are also found on the propensity to vote, political trust, societal satisfaction, and quality of life. These effects were verified in a cross-national sample from the fifth round of the European Social Survey. Simple regression of the true value of consumer demand, etc., on the true value of economic anxiety is made possible by an estimate of the reliability of our economic-anxiety score (cf. Bechtel, 2010; 2011; 2012). This reliability estimate corrects the regression slope of each societal variable for measurement error in the anxiety score.
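The correction described above is the classical disattenuation of a regression slope: under the errors-in-variables model, the naive slope on an error-prone score is the true slope multiplied by the score's reliability, so dividing by the reliability recovers the slope on the true score. A minimal sketch (generic, not the paper's specific estimates):

```python
def disattenuated_slope(cov_xy, var_x, reliability):
    """Correct a simple-regression slope for measurement error in the
    predictor: naive slope = cov(x, y) / var(x) is attenuated by the
    reliability of x, so divide it back out."""
    naive = cov_xy / var_x
    return naive / reliability
```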
Abstract: Bivariate data analysis plays a key role in several areas where the variables of interest are obtained in paired form, leading to the consideration of possible association measures between them. In most cases it is common to use well-known statistical measures such as the Pearson correlation and Kendall’s and Spearman’s coefficients. However, these measures may not represent the real correlation or dependence structure between the variables. Fisher and Switzer (1985) proposed a rank-based graphical tool, the so-called chi-plot, which, in conjunction with its Monte Carlo based confidence interval, can help detect the presence of association in a random sample from a continuous bivariate distribution. In this article we construct the asymptotic confidence interval for the chi-plot. A Monte Carlo simulation study shows that the coverage probabilities of the asymptotic and the Monte Carlo based confidence intervals are similar. An immediate advantage of the asymptotic confidence interval over the Monte Carlo based one is that it is computationally less expensive and allows any choice of confidence level. Moreover, it can be implemented straightforwardly in existing statistical software. The chi-plot approach is illustrated on data of average intelligence and atheism rates across nations.
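The chi-plot coordinates of Fisher and Switzer can be computed directly from empirical ranks; a plain-Python sketch (the plot itself and the confidence band are omitted):

```python
def chi_plot_values(x, y):
    """Return the (lambda_i, chi_i) pairs of the Fisher-Switzer chi-plot.
    Points with a degenerate denominator (extreme ranks) are skipped."""
    n = len(x)
    lams, chis = [], []
    for i in range(n):
        # Empirical bivariate and marginal CDFs, excluding the point itself
        H = sum(x[j] <= x[i] and y[j] <= y[i] for j in range(n) if j != i) / (n - 1)
        F = sum(x[j] <= x[i] for j in range(n) if j != i) / (n - 1)
        G = sum(y[j] <= y[i] for j in range(n) if j != i) / (n - 1)
        denom = (F * (1 - F) * G * (1 - G)) ** 0.5
        if denom == 0:
            continue
        chi = (H - F * G) / denom            # local dependence measure
        Fd, Gd = F - 0.5, G - 0.5
        lam = 4 * (1 if Fd * Gd >= 0 else -1) * max(Fd * Fd, Gd * Gd)
        lams.append(lam)
        chis.append(chi)
    return lams, chis
```

Under independence the chi values scatter around zero inside the confidence band; under strong positive dependence they pile up near one.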
Abstract: This paper develops a generalized least squares (GLS) estimator in a linear regression model with serially correlated errors. In particular, the asymptotic optimality of the proposed estimator is established. To obtain this result, we use the modified Cholesky decomposition to estimate the inverse of the error covariance matrix based on the ordinary least squares (OLS) residuals. The resulting matrix estimator maintains positive definiteness and converges to the corresponding population matrix at a suitable rate. The outstanding finite-sample performance of the proposed GLS estimator is illustrated using simulation studies and two real datasets.
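As an illustration of the feasible-GLS idea (a simple special case assuming AR(1) errors and Prais-Winsten whitening, not the paper's general modified-Cholesky estimator): estimate the error autocorrelation from OLS residuals, whiten both sides with a Cholesky-type factor of the inverse error covariance, then rerun OLS.

```python
import numpy as np

def feasible_gls_ar1(X, y):
    """Feasible GLS sketch: estimate an AR(1) coefficient from OLS
    residuals, whiten with the corresponding triangular factor L
    (so that L @ Sigma @ L.T is proportional to I), rerun OLS."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_ols
    rho = (e[:-1] @ e[1:]) / (e[:-1] @ e[:-1])  # lag-1 autocorrelation
    n = len(y)
    L = np.eye(n)
    L[0, 0] = np.sqrt(1 - rho**2)               # Prais-Winsten first row
    for t in range(1, n):
        L[t, t - 1] = -rho                      # quasi-differencing rows
    beta_gls, *_ = np.linalg.lstsq(L @ X, L @ y, rcond=None)
    return beta_gls
```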
Abstract: Observational studies of relatively large data sets can have potentially hidden heterogeneity with respect to causal effects and propensity scores, i.e., the patterns by which a putative cause is exposed to study subjects. This underlying heterogeneity can be crucial in causal inference for any observational study because it is systematically generated and structured by covariates that influence the cause and/or its related outcomes. To address the causal inference problem in view of the data structure, machine learning techniques such as tree analysis are naturally called for. Kang, Su, Hitsman, Liu and Lloyd-Jones (2012) proposed the Marginal Tree (MT) procedure to explore both the confounding and interacting effects of the covariates on causal inference. In this paper, we extend the MT method to the case of binary responses, along with a clear exposition of its relationship with the established causal odds ratio. We assess the causal effect of dieting on emotional distress using both a real data set from Lalonde’s National Supported Work Demonstration (NSW) analysis and a simulated data set from the National Longitudinal Study of Adolescent Health (Add Health).