Abstract: Trials comparing interventions in which clusters of subjects, rather than individuals, are randomized are commonly called cluster randomized trials (CRTs). For the comparison of binary outcomes in a CRT, although a few formulations for sample size computation have been published, the one most commonly used is that developed by Donner, Birkett, and Buck (Am J Epidemiol, 1981), probably owing to its incorporation in the textbook by Fleiss, Levin, and Paik (Wiley, 2003). In this paper, we derive a new χ² approximation formula with a general continuity correction factor (c) and show that, especially for scenarios with small event rates (< 0.01), the new formulation recommends fewer clusters than the Donner et al. formulation, thereby providing better efficiency. All known formulations can be shown to be special cases at specific values of the general correction factor (e.g., the Donner formulation is equivalent to the new formulation for c = 1). Statistical simulations comparing the efficacy of the available methods are presented, identifying correction factors that are optimal for rare event rates. A table of sample size recommendations for a variety of rare event rates, along with code in the "R" language for easy computation of sample size in other settings, is also provided. Sample size calculations for a published CRT (the "Pathways to Health" study, which evaluates an intervention for smoking cessation) are computed for various correction factors to illustrate that, with an optimal choice of the correction factor, the study could have maintained the same power with a 20% smaller sample size.
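To make the classical formulation concrete, the following is a minimal R sketch of a Donner-style sample size calculation for comparing two proportions in a CRT, i.e., the c = 1 special case referred to above. The function name, the example inputs, and the ordering of the continuity correction and design-effect inflation are illustrative assumptions, not the paper's new formula.

    # Hedged sketch (not the paper's new formula): clusters per arm for a CRT with a
    # binary outcome, using the classical two-proportion formula with Fleiss's
    # continuity correction (the c = 1 case), inflated by the design effect 1 + (m - 1) * rho.
    clusters_per_arm <- function(p1, p2, m, rho, alpha = 0.05, power = 0.80) {
      za <- qnorm(1 - alpha / 2)
      zb <- qnorm(power)
      pbar <- (p1 + p2) / 2
      # Uncorrected per-arm sample size under individual randomization
      n0 <- (za * sqrt(2 * pbar * (1 - pbar)) +
             zb * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))^2 / (p1 - p2)^2
      # Fleiss-type continuity correction
      nc <- (n0 / 4) * (1 + sqrt(1 + 4 / (n0 * abs(p1 - p2))))^2
      # Inflate by the design effect and convert subjects to clusters
      ceiling(nc * (1 + (m - 1) * rho) / m)
    }
    # Illustrative call with hypothetical rare event rates and cluster size
    clusters_per_arm(p1 = 0.005, p2 = 0.010, m = 100, rho = 0.01)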
Abstract: This paper is motivated by an investigation into the growth of pigs, which studied among other things the effect of short-term feed withdrawal on live weight. This treatment was thought to reduce the variability in the weights of the pigs. We represent this reduction as an attenuation in an animal-specific random effect. Given data on each pig before and after treatment, we consider the problems of testing for a treatment effect and measuring the strength of the effect, if significant. These problems are related to those of testing the homogeneity of correlated variances, and regression with errors in variables. We compare three different estimates of the attenuation factor using data on the live weights of pigs, and by simulation.
Abstract: This article presents and illustrates several important subset design approaches for Gaussian nonlinear regression models and for linear models where interest lies in a nonlinear function of the model parameters. These design strategies are particularly useful in situations where currently used subset design procedures fail to provide designs which can be used to fit the model function. Our original design technique is illustrated in conjunction with D-optimality, Bayesian D-optimality, and Kiefer's Φk-optimality, and is extended to yield subset designs which take account of curvature.
Abstract: We propose two simple, easy-to-implement methods for obtaining simultaneous credible bands in hierarchical models from standard Markov chain Monte Carlo output. The methods generalize Scheffé's (1953) approach to this problem, but in a Bayesian context. A small simulation study is followed by an application of the methods to a seasonal model for Ache honey gathering.
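Since the paper's own two methods are not reproduced here, the following is only a generic R sketch of one commonly used way to obtain a simultaneous band from posterior draws (a max-statistic band that rescales the pointwise posterior standard deviations); the function name and the S x p matrix layout of the draws are assumptions.

    # Simultaneous credible band from an S x p matrix of MCMC draws: scale the
    # pointwise posterior standard deviations by the 'level' quantile of the
    # largest standardized deviation within each draw.
    simultaneous_band <- function(draws, level = 0.95) {
      m <- colMeans(draws)
      s <- apply(draws, 2, sd)
      z <- apply(abs(sweep(sweep(draws, 2, m), 2, s, "/")), 1, max)
      c_star <- quantile(z, level)
      cbind(lower = m - c_star * s, upper = m + c_star * s)
    }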
Abstract: The assessment of modality or "bumps" in distributions is of interest to scientists in many areas. We compare the performance of four statistical methods for testing departures from unimodality in simulations, and further illustrate their advantages and disadvantages using the well-known ecological data sets on body mass published by Holling in 1992. Silverman's kernel density method was found to be very conservative. The excess mass test and a Bayesian mixture model approach showed agreement among the data sets, whereas Hall and York's test provided strong evidence for the existence of two or more modes in all data sets. The Bayesian mixture model also provided a way to quantify the uncertainty associated with the number of modes. This work demonstrates the inherent richness of animal body mass distributions but also the difficulty of characterizing them, and ultimately of understanding the processes underlying them.
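As a point of reference for the first of these methods, here is a minimal R sketch of the critical-bandwidth idea underlying Silverman's kernel density test; the function names and the bisection bounds are illustrative assumptions, and the full test would additionally bootstrap from the density estimate at the critical bandwidth.

    # Count the local maxima (modes) of a Gaussian kernel density estimate at bandwidth h
    count_modes <- function(x, h) {
      d <- density(x, bw = h, n = 1024)$y
      sum(diff(sign(diff(d))) == -2)
    }
    # Smallest bandwidth at which the estimate has at most k modes (bisection search);
    # for a Gaussian kernel the mode count is nonincreasing in the bandwidth
    critical_bandwidth <- function(x, k = 1, lower = 0.01 * sd(x), upper = 2 * sd(x)) {
      for (i in 1:50) {
        h <- (lower + upper) / 2
        if (count_modes(x, h) <= k) upper <- h else lower <- h
      }
      upper
    }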
Abstract: In modeling and analyzing multivariate data, the conventionally used measure of dependence structure is the Pearson correlation coefficient. However, use of the correlation as a dependence measure has several pitfalls. Copulas have recently emerged as an alternative measure of dependence, overcoming most of the drawbacks of the correlation. We discuss Archimedean copulas and their relationships with tail dependence. An algorithm to construct empirical and Archimedean copulas is described. Monte Carlo simulations are carried out to replicate and analyze data sets by identifying the appropriate copula. We apply the Archimedean copula based methodology to assess the accuracy of Doppler echocardiography in determining aortic valve area, using data from the Aortic Stenosis: Simultaneous Doppler-Catheter Correlative study carried out at the King Faisal Specialist Hospital and Research Centre, Riyadh, KSA.
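As a small illustration of the kind of Archimedean-copula quantities involved, the following R sketch fits Clayton and Gumbel copulas by inverting Kendall's tau and reports their implied tail-dependence coefficients; the function name is hypothetical, the fit assumes positive dependence (tau > 0), and this is not the empirical-copula construction algorithm described in the paper.

    # Method-of-moments (Kendall's tau) fits for two Archimedean copulas and their
    # implied tail-dependence coefficients; assumes tau > 0 (positive dependence).
    archimedean_fit <- function(x, y) {
      tau <- cor(x, y, method = "kendall")
      theta_clayton <- 2 * tau / (1 - tau)   # Clayton: tau = theta / (theta + 2)
      theta_gumbel  <- 1 / (1 - tau)         # Gumbel:  tau = 1 - 1 / theta
      list(tau = tau,
           clayton = c(theta = theta_clayton, lower_tail_dep = 2^(-1 / theta_clayton)),
           gumbel  = c(theta = theta_gumbel,  upper_tail_dep = 2 - 2^(1 / theta_gumbel)))
    }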
Abstract: Missing values are not uncommon in longitudinal data studies. Missingness may be due to withdrawal from the study (dropout) or may be intermittent. The missing data mechanism is termed non-ignorable if the probability of missingness depends on the unobserved (missing) observations. This paper presents a model for continuous longitudinal data with non-ignorable, non-monotone missing values. Two separate models, for the response and for missingness, are assumed. The response is modeled as multivariate normal, whereas a binomial model is assumed for the missingness process. Parameters in the adopted model are estimated using the stochastic EM algorithm. The proposed approach is then applied to an example from the International Breast Cancer Study Group.
Abstract: This paper estimates the interest rate term structures of Treasury and individual corporate bonds using a robust criterion. The Treasury term structure is estimated with Bayesian regression splines based on nonlinear least absolute deviation. The number and locations of the knots in the regression splines are adaptively chosen using the reversible jump Markov chain Monte Carlo method. Due to the small sample size, the individual corporate term structure is estimated by adding a positive parametric credit spread to the estimated Treasury term structure using a Bayesian approach. We present a case study of U.S. Treasury STRIPS (Separate Trading of Registered Interest and Principal of Securities) and AT&T bonds from April 1994 to December 1996. Compared with several existing term structure estimation approaches, the proposed method is robust to outliers in our case study.
Abstract: In this paper, we introduce an extended four-parameter Fréchet model called the exponentiated exponential Fréchet distribution, which arises from the quantile function of the standard exponential distribution. Various mathematical properties are derived, including the quantile function, ordinary and incomplete moments, Bonferroni and Lorenz curves, mean deviations, mean residual life, mean waiting time, generating function, Shannon entropy, and order statistics. The model parameters are estimated by the method of maximum likelihood and the observed information matrix is determined. The usefulness of the new distribution is illustrated by means of three real lifetime data sets. In fact, the new model provides a better fit to these data than the Marshall-Olkin Fréchet, exponentiated Fréchet, and Fréchet models.