In the linear regression setting, we propose a general framework, termed weighted orthogonal components regression (WOCR), which encompasses many known methods as special cases, including ridge regression and principal components regression. WOCR makes use of the monotonicity inherent in orthogonal components to parameterize the weight function. The formulation allows for efficient determination of tuning parameters and hence is computationally advantageous. Moreover, WOCR offers insights for deriving new and improved variants. Specifically, we advocate assigning weights to components based on their correlations with the response, which may lead to enhanced predictive performance. Both simulation studies and real data examples are provided to assess and illustrate the advantages of the proposed methods.
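As a rough illustration of how ridge regression and principal components regression can both be read as weighting schemes over orthogonal components, the sketch below uses the standard singular value decomposition view: ridge shrinks the component with singular value d by d^2 / (d^2 + lambda), while principal components regression gives weight 1 to the k leading components and 0 to the rest. The function names (`ridge_weights`, `pcr_weights`, `weighted_fit`) are hypothetical and the parameterization is an assumption for illustration, not necessarily the one used in WOCR.

```python
def ridge_weights(singular_values, lam):
    """Ridge shrinkage weight d^2 / (d^2 + lam) per component (assumes lam > 0)."""
    return [d * d / (d * d + lam) for d in singular_values]


def pcr_weights(singular_values, k):
    """PCR weight: 1 for the k components with the largest singular values, else 0."""
    order = sorted(range(len(singular_values)), key=lambda j: -singular_values[j])
    keep = set(order[:k])
    return [1.0 if j in keep else 0.0 for j in range(len(singular_values))]


def weighted_fit(components, weights, y):
    """Fitted values sum_j w_j * u_j * (u_j . y), for orthonormal components u_j."""
    fitted = [0.0] * len(y)
    for u, w in zip(components, weights):
        score = sum(ui * yi for ui, yi in zip(u, y))  # projection of y onto u_j
        for i, ui in enumerate(u):
            fitted[i] += w * score * ui
    return fitted
```

Both weight functions are monotone in the singular value, which is the kind of structure a low-dimensional parameterization of the weights can exploit.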
Abstract: In this paper we consider clinical trials with two treatments and a non-normally distributed response variable. In addition, we focus on applications which include only discrete covariates and their interactions. For such applications, the semi-parametric Area Under the ROC Curve (AUC) regression model proposed by Dodd and Pepe (2003) can be used. However, because a logistic regression procedure is used to obtain parameter estimates and a bootstrapping method is needed for computing parameter standard errors, their method may be cumbersome to implement. In this paper we propose to use a set of AUC estimates to obtain parameter estimates and to combine DeLong’s method and the delta method for computing parameter standard errors. Our new method avoids the heavy computation associated with Dodd and Pepe’s method and hence is easy to implement. We conduct simulation studies to show that the two methods yield similar results. Finally, we illustrate our new method using data from urinary incontinence clinical trials.
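The building block of such an approach, a nonparametric AUC estimate with a DeLong-type variance for one pair of treatment groups, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `auc_delong` is hypothetical, each group needs at least two observations, and the delta-method step that maps a set of AUC estimates to regression parameters is not shown.

```python
def auc_delong(x, y):
    """Empirical AUC = P(X > Y) + 0.5 * P(X = Y), with DeLong's variance estimate.

    x, y: responses in the two groups (each of length >= 2).
    Returns (auc, variance).
    """
    m, n = len(x), len(y)

    def psi(a, b):
        # Mann-Whitney kernel: 1 if a beats b, 0.5 for ties, 0 otherwise
        return 1.0 if a > b else 0.5 if a == b else 0.0

    # Structural components: one per observation in each group
    v10 = [sum(psi(a, b) for b in y) / n for a in x]
    v01 = [sum(psi(a, b) for a in x) / m for b in y]

    auc = sum(v10) / m  # equals the mean of v01 as well

    # Sample variances of the structural components
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)

    return auc, s10 / m + s01 / n
```

Because the estimate is rank-based, it does not rely on the response being normally distributed, and no resampling is needed to obtain the variance.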
Abstract: In this article we propose a further extension of the generalized Marshall-Olkin-G (GMO-G) family of distributions. The density and survival functions are expressed as infinite mixtures of the GMO-G distribution. Asymptotes, Rényi entropy, order statistics, probability weighted moments, moment generating function, quantile function, median, random sample generation and parameter estimation are investigated. Selected distributions from the proposed family are compared with those from four sub-models of the family as well as with some other recently proposed models by considering real life data fitting applications. In all cases the distributions from the proposed family come out on top.
Abstract: Two methods for clustering data and choosing a mixture model are proposed. First, we derive a new classification algorithm based on the classification likelihood. Then, the likelihood conditional on these clusters is written as the product of the likelihoods of each cluster, and AIC- or BIC-type approximations are applied. The resulting criteria turn out to be the sum of the AIC or BIC relative to each cluster plus an entropy term. The performance of our methods is evaluated by Monte Carlo methods and on a real data set, showing in particular that the iterative estimation algorithm converges quickly in general, and thus the computational load is rather low.
Abstract: For model selection in mixed effects models, Vaida and Blanchard (2005) demonstrated that the marginal Akaike information criterion is appropriate for questions regarding the population, and the conditional Akaike information criterion is appropriate for questions regarding the particular clusters in the data. This article shows that the marginal Akaike information criterion is asymptotically equivalent to leave-one-cluster-out cross-validation and the conditional Akaike information criterion is asymptotically equivalent to leave-one-observation-out cross-validation.
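To make the two cross-validation schemes concrete, the sketch below generates the train/test index splits for each. It only illustrates how the held-out sets differ; the function names are hypothetical, and the model fitting and the asymptotic equivalence results themselves are not reproduced here.

```python
def leave_one_cluster_out(cluster_ids):
    """One fold per cluster: every observation in that cluster is held out together."""
    folds = []
    for c in dict.fromkeys(cluster_ids):  # unique labels in order of first appearance
        test = [i for i, g in enumerate(cluster_ids) if g == c]
        train = [i for i, g in enumerate(cluster_ids) if g != c]
        folds.append((train, test))
    return folds


def leave_one_observation_out(cluster_ids):
    """One fold per observation: the rest of its cluster stays in the training set."""
    n = len(cluster_ids)
    return [([j for j in range(n) if j != i], [i]) for i in range(n)]
```

With labels such as `["a", "a", "b", "c", "c"]`, the first scheme yields three folds and predicts for clusters never seen in training (a population-level question), whereas the second yields five folds and predicts within clusters that remain partly observed (a cluster-specific question).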