Abstract: Affymetrix high-density oligonucleotide microarrays make it possible to simultaneously measure, and thus compare, the expression profiles of hundreds of thousands of genes in living cells. Genes that are differentially expressed across conditions are of great importance to both basic and medical research. However, before these differentially expressed genes can be detected from a vast number of candidates, the microarray data must be normalized to remove the substantial variation caused by non-biological factors. During the last few years, normalization methods based on probe-level or probeset-level intensities have been proposed in the literature. These methods were motivated by different purposes. In this paper, we propose a multivariate normalization method, based on partial least squares regression, that aims to equalize the central tendency and to reduce and equalize the variation of the probe-level intensities within each probeset across replicated arrays. In doing so, we hope to enable more precise estimation of the gene expression indexes.
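For illustration only, the following is a minimal sketch of fitting a partial least squares regression of one replicate array's probe-level intensities on the remaining replicates in Python with scikit-learn; the simulated data, the number of components, and the use of fitted values as normalized intensities are assumptions for this example and do not reproduce the authors' algorithm.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    # Hypothetical probe-level log2 intensities: rows = probes, columns = replicate arrays.
    rng = np.random.default_rng(0)
    probe_effects = rng.normal(8.0, 2.0, size=(1000, 1))                  # latent probe effects
    intensities = probe_effects + rng.normal(0.0, 0.5, size=(1000, 4))    # 4 replicate arrays

    # Regress one array's probe intensities on the other replicates and take the
    # fitted values as PLS-adjusted intensities for that array.
    target = intensities[:, 0]
    others = intensities[:, 1:]
    pls = PLSRegression(n_components=2)
    pls.fit(others, target)
    normalized_target = pls.predict(others).ravel()

    print("raw SD:", target.std(), "adjusted SD:", normalized_target.std())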
Abstract: Design-based regression regards the survey response as a constant waiting to be observed. Bechtel (2007) replaced this constant with the sum of a fixed true value and a random measurement error. The present paper relaxes the assumption that the expected error is zero within a survey respondent. It also allows measurement errors in predictor variables as well as in the response variable. Reasonable assumptions about these errors over respondents, along with coefficient alpha in psychological test theory, enable the regression of true responses on true predictors. This resolves two major issues in survey regression, i.e. errors in variables and item non-response. The usefulness of this resolution is demonstrated with three large datasets collected by the European Social Survey in 2002, 2004 and 2006. The paper concludes with implications of true-value regression for survey theory and practice and for surveying large world populations.
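As background only, coefficient alpha and the classical correction for attenuation from psychological test theory take the following form; this is the textbook result, not necessarily the exact estimator used in the paper. For a k-item scale with item variances \(\sigma_i^2\) and total-score variance \(\sigma_X^2\),

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right), \]

and if \(\alpha_X\) and \(\alpha_Y\) estimate the reliabilities of an observed predictor X and response Y, the correlation between the underlying true scores can be recovered as

\[ \rho_{T_X T_Y} \approx \frac{\rho_{XY}}{\sqrt{\alpha_X\,\alpha_Y}} . \]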
Abstract: The present article discusses and compares multiple testing procedures (MTPs) for controlling the familywise error rate. Machekano and Hubbard (2006) proposed an empirical Bayes approach, a resampling-based multiple testing procedure that asymptotically controls the familywise error rate. In this paper we provide some additional work on their procedure, and we develop a resampling-based step-down procedure that asymptotically controls the familywise error rate for testing families of one-sided hypotheses. We apply these procedures to make successive comparisons between treatment effects under a simple-order assumption; for example, the treatments may correspond to a sequence of increasing dose levels of a drug. Using simulations, we demonstrate that the proposed step-down procedure is less conservative than Machekano and Hubbard's procedure. The application of the procedure is illustrated with an example.
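To illustrate the general idea of a resampling-based step-down procedure, a minimal sketch of a Westfall-Young-style step-down maxT algorithm (shown only for orientation; it is not the specific procedure developed in the paper) is:

    import numpy as np

    def stepdown_maxT(obs_stats, resampled_stats):
        """Generic step-down maxT adjusted p-values.

        obs_stats       : (m,) observed test statistics (large values favor rejection).
        resampled_stats : (B, m) statistics recomputed on B resampled data sets,
                          approximating the joint null distribution.
        """
        m = len(obs_stats)
        order = np.argsort(obs_stats)[::-1]            # most significant hypothesis first
        adj_p = np.empty(m)
        for k in range(m):
            # successive maxima over the hypotheses not yet stepped past
            succ_max = resampled_stats[:, order[k:]].max(axis=1)
            adj_p[order[k]] = np.mean(succ_max >= obs_stats[order[k]])
        # enforce monotonicity of the adjusted p-values down the ordered list
        adj_p[order] = np.maximum.accumulate(adj_p[order])
        return adj_p

    # Hypothetical example: 5 one-sided statistics, 1000 resampled replicates.
    rng = np.random.default_rng(1)
    obs = np.array([3.1, 2.4, 0.8, 1.9, 0.2])
    boot = rng.normal(size=(1000, 5))
    print(stepdown_maxT(obs, boot))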
Abstract: This paper considers the statistical problems of editing and imputing data of multiple time series generated by repetitive surveys. The case under study is that of the Survey of Cattle Slaughter in Mexico's Municipal Abattoirs. The proposed procedure consists of two phases: first, the data of each abattoir are edited to correct gross inconsistencies; second, the missing data are imputed by means of restricted forecasting. This method uses all the historical and current information available for the abattoir, together with multiple time series models, from which efficient estimates of the missing data are obtained. Some empirical examples are shown to illustrate the usefulness of the method in practice.
Abstract: Good inference for the random effects in a linear mixed-effects model is important because of their role in decision making. For example, estimates of the random effects may be used to make decisions about the quality of medical providers such as hospitals and surgeons. Standard methods assume that the random effects are normally distributed, but this may be problematic because inferences are sensitive to this assumption and to the composition of the study sample. We investigate whether using a Dirichlet process prior instead of a normal prior for the random effects is effective in reducing the dependence of inferences on the study sample. Specifically, we compare the two models, normal and Dirichlet process, emphasizing inferences for extrema. Our main finding is that the Dirichlet process prior provides inferences that are substantially more robust to the composition of the study sample.
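In schematic form, with notation assumed for this illustration, the two competing specifications for a linear mixed-effects model with provider effects \(\theta_i\) differ only in the prior placed on those effects:

\[ y_{ij} = x_{ij}^{\top}\beta + \theta_i + \varepsilon_{ij}, \qquad \varepsilon_{ij}\sim N(0,\sigma^2), \]

\[ \text{normal model: } \theta_i \sim N(0,\tau^2) \qquad \text{versus} \qquad \text{DP model: } \theta_i \mid G \sim G, \quad G \sim \mathrm{DP}\big(\alpha,\, N(0,\tau^2)\big). \]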
Abstract: The aim of this paper is to investigate the flexibility of the skew-normal distribution for classifying the pixels of a remotely sensed satellite image. In most remote sensing packages, for example ENVI and ERDAS, the populations are assumed to follow a multivariate normal distribution. A linear discriminant function (LDF) or a quadratic discriminant function (QDF) is then used to classify the pixels, depending on whether the population covariance matrices are assumed to be equal or unequal, respectively. However, data obtained from satellite or airborne images often suffer from non-normality. In this case, the skew-normal discriminant function (SDF) is one technique for obtaining a more accurate classified image. In this study, we compare the SDF with the LDF and QDF using simulation under different scenarios. The results show that ignoring the skewness of the data increases the misclassification probability and consequently produces an inaccurate image. An application is provided to illustrate the effect of incorrect assumptions on image accuracy.
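For reference, the multivariate skew-normal density underlying an SDF can be written, in Azzalini's parameterization (the exact classification rule used in the paper may differ), as

\[ f(x) = 2\,\phi_p(x;\,\xi,\Omega)\,\Phi\!\big(\alpha^{\top}\omega^{-1}(x-\xi)\big), \]

where \(\phi_p\) is the p-variate normal density with location \(\xi\) and scale matrix \(\Omega\), \(\omega\) is the diagonal matrix of scale standard deviations, and \(\alpha\) controls skewness; a pixel x is then assigned to the class g that maximizes \(\pi_g f_g(x)\), which reduces to the LDF or QDF when \(\alpha_g = 0\).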
Abstract: The assumption that is usually made when modeling count data is that the response variable, which is the count, is correctly reported. Some counts, however, might be over- or under-reported. We derive the Generalized Poisson-Poisson mixture regression (GPPMR) model, which can handle accurate, under-reported and over-reported counts. The parameters in the model are estimated via the maximum likelihood method. We apply the GPPMR model to a real-life data set.
Abstract: This paper studies the effect the tax environment has on the health care coverage of individuals. This study adds to the current literature on health care policy by examining how individuals switch types of health care coverage given a change in the tax environment. The distribution of health care coverage is investigated using transition matrices. A model is then used to determine how individuals might be expected to switch insurance types given a change in the tax environment. Based on the results of this study, the authors give some recommendations on what the results may imply for health care policy makers.
Abstract: Li and Tiwari (2008) recently developed a corrected Z-test statistic for comparing the trends in cancer age-adjusted mortality and incidence rates across overlapping geographic regions, by properly adjusting for the correlation between the slopes of the fitted simple linear regression equations. One of their key assumptions is that the errors have an unknown but common variance. However, since the age-adjusted rates are linear combinations of mortality or incidence counts, arising naturally from an underlying Poisson process, this constant-variance assumption may be violated. This paper develops a weighted-least-squares based test that incorporates heteroscedastic error variances, and thus significantly extends the work of Li and Tiwari. Simulations and an application to the age-adjusted mortality data from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute show that the proposed test generally outperforms the aforementioned test.
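In schematic form, with notation assumed for illustration (the paper's corrected statistic additionally adjusts the covariance term for the overlap between regions), a weighted-least-squares slope for age-adjusted rates \(y_t\) observed at times \(t = 1, \dots, n\) with estimated variances \(\hat\sigma_t^2\) is

\[ \hat\beta = \frac{\sum_t w_t\,(t-\bar t_w)(y_t-\bar y_w)}{\sum_t w_t\,(t-\bar t_w)^2}, \qquad w_t = \frac{1}{\hat\sigma_t^2}, \]

where \(\bar t_w\) and \(\bar y_w\) are the weighted means, and two regional trends are compared via

\[ Z = \frac{\hat\beta_1-\hat\beta_2}{\sqrt{\widehat{\mathrm{Var}}(\hat\beta_1)+\widehat{\mathrm{Var}}(\hat\beta_2)-2\,\widehat{\mathrm{Cov}}(\hat\beta_1,\hat\beta_2)}} . \]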