Abstract: Linear regression models are often useful tools for exploring the relationship between a response and a set of explanatory (predictor) variables. When both the observed response and the predictor variables are contaminated or distorted by unknown functions of an observable confounder, inferring the underlying relationship between the latent (unobserved) variables is more challenging. Recently, Şentürk and Müller (2005) proposed the method of covariate-adjusted regression (CAR) analysis for this distorted data setting. In this paper, we describe graphical techniques for assessing departures from, or violations of, specific assumptions regarding the type and form of the data distortion. The type of distortion may be multiplicative, additive, or absent (no distortion). The form of the distortion encompasses a class of general smooth distorting functions. Common confounding adjustment methods in regression analysis, however, implicitly make distortion assumptions, such as assuming additive or multiplicative linear distortions. We illustrate graphical detection of departures from such assumptions using numerical and real data examples. The proposed graphical assessment of distortion assumptions is feasible because the CAR estimation method utilizes a local regression technique to estimate a set of transformed distorting functions (Şentürk and Nguyen, 2006).
Abstract: Recently, count regression models have been used to model over-dispersed and zero-inflated count response variables that are affected by one or more covariates. Generalized Poisson (GP) and negative binomial (NB) regression models have been suggested to deal with over-dispersion. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been used to handle count data with many zeros. The aim of this study is to model the number of C. caretta hatchlings dying from exposure to the sun. We present a framework for evaluating the suitability of the Poisson, NB, GP, ZIP and ZIGP models for a zoological data set in which the count data may exhibit many zeros and over-dispersion. Estimation of the model parameters by the method of maximum likelihood (ML) is provided. Based on the score test and the goodness-of-fit measure for the zoological data, the GP regression model performs better than the other count regression models.
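As a minimal sketch of why zero-inflated models can capture both excess zeros and over-dispersion, the following Python snippet writes out the standard ZIP probability mass function together with its closed-form mean and variance. The function names `zip_pmf` and `zip_mean_var` and the parameter values are illustrative assumptions; they are not taken from the study and do not reproduce its regression models.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson: an excess zero with
    probability pi, otherwise a Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def zip_mean_var(lam, pi):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var


# Hypothetical parameter values, not estimates from the study.
lam, pi = 2.0, 0.3
mean, var = zip_mean_var(lam, pi)
print(f"P(Y=0) = {zip_pmf(0, lam, pi):.4f}")
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

Whenever `pi` is positive, the variance exceeds the mean by the factor `1 + pi * lam`, so excess zeros alone induce over-dispersion relative to a plain Poisson model.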
Abstract: An epidemiological cohort study that adopts a two-phase design raises the serious issue of how to treat a fairly large amount of missing values that are either Missing At Random (MAR) due to the study design or potentially Missing Not At Random (MNAR) due to non-response and loss to follow-up. Cognitive impairment (CI) is an evolving concept that needs epidemiological characterization in order to mature. In this work, we attempt to estimate the incidence rate of CI by accounting for the aforementioned missing-data process. We consider baseline and first follow-up data of 2191 African-Americans enrolled in a prospective epidemiological study of dementia that adopted a two-phase sampling design. We developed a multiple imputation procedure in the mixture model framework that can be easily implemented in SAS. A sensitivity analysis is carried out to assess the dependence of the estimates on specific model assumptions. It is shown that African-Americans aged 65-75 have a much higher incidence rate of CI than younger or older elderly individuals. In conclusion, multiple imputation provides a practical and general framework for the estimation of epidemiological characteristics in two-phase sampling studies.
Abstract: We present power calculations for zero-inflated Poisson (ZIP) and zero-inflated negative-binomial (ZINB) models. We detail direct computations for a ZIP model based on a two-sample Wald test using the expected information matrix. We also demonstrate how Lyles, Lin, and Williamson’s method (2006) of power approximation for categorical and count outcomes can be extended to both zero-inflated models. This method can be used for power calculations based on the Wald test (via the observed information matrix) and the likelihood ratio test, and can accommodate both categorical and continuous covariates. All the power calculations can be conducted when covariates are used in the modeling of both the count data and the “excess zero” data, or in either part separately. We present simulations to detail the performance of the power calculations. Analysis of a malaria study is used for illustration.
Abstract: An individual in a finite population is represented by a random variable whose expectation is linearly composed of explanatory variables and a personal effect. This expectation locates her (his) random variable on a scale when s(he) responds to a questionnaire item or physical instrument. This formulation reinterprets design-based sampling, which represents an individual as a constant waiting to be observed. Retaining constant expectations, however, along with fixed realizations of random variables, preserves and strengthens design-based theory through the Horvitz-Thompson (1952) theorem. This interpretation reaffirms the usual design-based regression estimates, whose normality is seen to be free of any assumptions about the distribution of the outcome variable. It also formulates response error in a way that renders a superpopulation, postulated by model-based sampling, unnecessary. The value of distribution-free regression is illustrated with an analysis of American presidential approval.
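For readers unfamiliar with the design-based result invoked here, the sketch below illustrates the classical Horvitz-Thompson estimator of a population total, which weights each sampled value by the inverse of its inclusion probability. It is an illustration only, under stated assumptions: the function name `horvitz_thompson_total`, the three-unit population and the simple-random-sampling design are hypothetical and do not reproduce the paper's reformulation or its regression estimates.

```python
from itertools import combinations


def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over the sample."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    total = 0.0
    for y, p in zip(values, inclusion_probs):
        if not 0.0 < p <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        total += y / p
    return total


# Hypothetical population of fixed values. Under simple random sampling of
# n = 2 units without replacement from N = 3, every unit has inclusion
# probability 2/3.
population = [4.0, 10.0, 16.0]
pi = 2.0 / 3.0

# Enumerate every possible sample (all equally likely under this design)
# and average the estimates to check design-unbiasedness.
estimates = [
    horvitz_thompson_total(sample, [pi] * len(sample))
    for sample in combinations(population, 2)
]
average_estimate = sum(estimates) / len(estimates)
print(average_estimate, sum(population))
```

Averaging over all equally likely samples recovers the true total up to floating-point rounding. This is the design-unbiasedness property, and it holds without any assumption about how the values themselves are distributed.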
Abstract: Foreign direct investment (FDI) has traditionally been considered an important channel for the diffusion of advanced technology. Whether it can promote technological progress in the host country remains a much-debated question. This paper analyzes the relationship between FDI and regional innovation capability (RIC). We find that the spillover effects of FDI are not as significant as is usually thought. The impact of FDI on RIC is weak, and the entry of FDI does not enhance indigenous innovation capability. Moreover, inward FDI might have a crowding-out effect on innovation and domestic R&D activity. The results indicate that increasing domestic R&D inputs and strengthening the innovation capabilities and absorptive capacity of domestic enterprises are the determinants of improved RIC.
Abstract: Comparative mathematical textbook analysis aims at the determination of differences among countries concerning the development and transmission of mathematics. Textual statistics, in turn, provides a means to quantify a text by applying multivariate statistical techniques. So far, this statistical approach has not been applied to comparative mathematical textbook analysis. The object of this paper is to quantify and compare the style of a number of textbooks on differential calculus written in 18th century Europe. To that end, two multivariate statistical techniques have been applied: 1) simple correspondence analysis and 2) hierarchical clustering analysis. The results of both analyses help to detect some interesting associations among the analysed textbooks.