Although hypothesis testing has been misused and abused, we argue that it remains an important method of inference. Requiring preregistration of the details of the inferences planned for a study is a major step toward preventing abuse. In practice, however, the null hypothesis is almost always taken to be a “point null”, that is, a hypothesis that a parameter equals a constant. One reason is that a point null makes the required computations easier, but with modern computing power this is no longer a compelling justification. In this note we explore the interval null hypothesis, under which the parameter lies in a fixed interval. We consider a specific example in detail.
Abstract: Sample size and power calculations are often based on a two-group comparison. However, in some instances group membership cannot be ascertained until after the sample has been collected. In this situation, the sizes of the two groups may differ from those prespecified because of binomial variability, so the achieved power differs from that expected. Here we suggest that investigators calculate an “expected power” that takes into account the binomial variability of group membership, and adjust the sample size accordingly when planning such studies. We explore different scenarios in which such an adjustment may or may not be necessary, for both continuous and binary responses. In general, the number of additional subjects required depends only slightly on the (standardized) difference in the two group means or proportions, and more strongly on the relative sizes of the two groups. We present tables of adjusted sample sizes for a variety of scenarios that investigators can readily use at the study design stage. The proposed approach is motivated by a genetic study of cerebral malaria and a sleep apnea study.
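The idea of an expected power can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a continuous response, a two-sided two-sample z-test with known unit variance, and group 1 membership following a Binomial(N, p) distribution. The names `power_two_group`, `expected_power` and `adjusted_sample_size` are hypothetical, and treating an empty group as zero power is a choice made here.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_two_group(n1, n2, d, alpha=0.05):
    """Power of a two-sided z-test for standardized difference d (assumed model)."""
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption: no comparison is possible with an empty group
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    ncp = abs(d) * sqrt(n1 * n2 / (n1 + n2))
    return _STD.cdf(ncp - z_crit) + _STD.cdf(-ncp - z_crit)


def binom_pmf(k, n, p):
    """Binomial probability computed on the log scale; requires 0 < p < 1."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def expected_power(n_total, p, d, alpha=0.05):
    """Average the power over the binomial distribution of the group 1 size."""
    return sum(binom_pmf(k, n_total, p) * power_two_group(k, n_total - k, d, alpha)
               for k in range(n_total + 1))


def adjusted_sample_size(p, d, target=0.8, alpha=0.05, n_max=5000):
    """Smallest total sample size whose expected power reaches the target."""
    for n in range(2, n_max + 1):
        if expected_power(n, p, d, alpha) >= target:
            return n
    raise ValueError("target power not reached within n_max")


n_adj = adjusted_sample_size(p=0.5, d=0.5)
print(n_adj, expected_power(n_adj, 0.5, 0.5))
```

Comparing `adjusted_sample_size` with the total from a fixed-allocation calculation gives the number of additional subjects needed under these assumptions. A binary response would need a different power function in place of `power_two_group`.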
Abstract: Existing methods for sample size calculation with right-censored data largely assume that the failure times follow an exponential distribution or the Cox proportional hazards model. Methods under the additive hazards model are scarce. Motivated by a well-known example of right-censored failure time data for which the additive hazards model fits better than the Cox model, we propose a method for power and sample size calculation for a two-group comparison assuming the additive hazards model. This model allows the investigator to specify a group difference in terms of a hazard difference and to choose an increasing, constant or decreasing baseline hazard. The power computation is based on the Wald test. Extensive simulation studies are performed to demonstrate the performance of the proposed approach. Our simulations also show substantially decreased power if the additive hazards model is misspecified as the Cox proportional hazards model.
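A Wald-type power calculation for a hazard difference can be illustrated in a deliberately simplified special case. The sketch below is not the proposed method: it assumes a constant baseline hazard (exponential failure times), administrative censoring of every subject at a common time, equal group sizes, and fully parametric maximum likelihood estimation of each group's hazard. All function and parameter names are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def hazard_variance(rate, n, censor_time):
    """Large-sample variance of the exponential MLE of a constant hazard.

    Assumption: all n subjects are censored at the same time `censor_time`,
    so the per-subject information is P(event) / rate**2.
    """
    p_event = 1.0 - exp(-rate * censor_time)
    return rate ** 2 / (n * p_event)


def wald_power(base_rate, delta, n_per_group, censor_time, alpha=0.05):
    """Approximate power of a two-sided Wald test of H0: hazard difference = 0."""
    rate0, rate1 = base_rate, base_rate + delta
    if rate0 <= 0 or rate1 <= 0:
        raise ValueError("both group hazards must be positive")
    se = sqrt(hazard_variance(rate0, n_per_group, censor_time)
              + hazard_variance(rate1, n_per_group, censor_time))
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    ncp = abs(delta) / se
    return _STD.cdf(ncp - z_crit) + _STD.cdf(-ncp - z_crit)


def sample_size_per_group(base_rate, delta, censor_time,
                          target=0.8, alpha=0.05, n_max=1_000_000):
    """Smallest per-group size whose approximate power reaches the target."""
    for n in range(2, n_max + 1):
        if wald_power(base_rate, delta, n, censor_time, alpha) >= target:
            return n
    raise ValueError("target power not reached within n_max")


n = sample_size_per_group(base_rate=0.10, delta=0.05, censor_time=10.0)
print(n, wald_power(0.10, 0.05, n, 10.0))
```

The group difference enters as an absolute hazard difference `delta` rather than a hazard ratio, which is the feature of the additive hazards formulation that this sketch is meant to show. Increasing or decreasing baseline hazards and other censoring patterns are outside its scope.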