Abstract: Early-phase clinical trials may not have a known standard deviation (σ) for the response variable. In the context of t-test statistics, several procedures have been proposed that use the information gained from stage I (the pilot study) to adaptively re-estimate the sample size for the overall hypothesis test. We are interested in choosing a reasonable stage-I sample size (m) that leads to an accountable overall sample size (stage I and later). Conditional on any specified m, this paper replaces σ with its stage-I estimate (based on m observations) in the conventional sample size formula under the normal distribution assumption to re-estimate the overall sample size. The estimated σ, the re-estimated overall sample size and the collective information (stage I and later) are incorporated into a surrogate normal variable, which is tested against the standard normal distribution. We plot the actual type I and type II error rates and the expected sample size against m in order to choose a good universal stage-I sample size (m*) to start with.
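As a rough illustration of the re-estimation step (a minimal sketch only, not the paper's full procedure: the function name reestimate_n, the one-sample setting, the effect size delta and the error rates are assumptions made here), the stage-I estimate of σ can be plugged into the conventional normal-theory sample size formula:

```python
import numpy as np
from scipy import stats

def reestimate_n(stage1_data, delta, alpha=0.05, beta=0.20):
    """Plug the stage-I estimate of sigma into the usual normal-theory
    sample size formula for a one-sample test of a mean shift of size delta."""
    sigma_hat = np.std(stage1_data, ddof=1)          # stage-I estimate of sigma
    z_a = stats.norm.ppf(1 - alpha / 2)              # two-sided type I error
    z_b = stats.norm.ppf(1 - beta)                   # power 1 - beta
    n = ((z_a + z_b) * sigma_hat / delta) ** 2       # conventional formula with sigma replaced
    return max(int(np.ceil(n)), len(stage1_data))    # overall n cannot fall below the stage-I size m

# Illustrative use with m = 10 pilot observations
rng = np.random.default_rng(0)
pilot = rng.normal(loc=0.0, scale=1.0, size=10)
print(reestimate_n(pilot, delta=0.5))
```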
Abstract: This paper is concerned with change point analysis in a general class of distributions. Quasi-Bayes and likelihood ratio test procedures are considered for testing the null hypothesis of no change point. Exact and asymptotic behaviors of the two test statistics are derived. To compare the performance of the two test procedures, numerical significance levels and powers of the tests are tabulated for selected values of the parameters. Estimation of the change point based on these two test procedures is also considered. Moreover, the epidemic change point problem is studied as an alternative to the single change point model. A real data set with an epidemic change is analyzed using the two test procedures.
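As a hedged illustration of the likelihood ratio idea (a sketch only; it assumes normal observations with known unit variance and a single mean change, not the paper's general class of distributions or its quasi-Bayes procedure), a basic scan over candidate change points might look like this:

```python
import numpy as np

def lrt_change_point(x):
    """Likelihood ratio scan for a single change in mean of normal data with
    known unit variance: compare the two-segment fit with the no-change fit."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    stats_k = np.full(n, -np.inf)
    for k in range(1, n):                      # candidate change point after index k-1
        m1, m2, m0 = x[:k].mean(), x[k:].mean(), x.mean()
        # 2 * log likelihood ratio under unit variance
        stats_k[k] = (np.sum((x - m0) ** 2)
                      - np.sum((x[:k] - m1) ** 2)
                      - np.sum((x[k:] - m2) ** 2))
    k_hat = int(np.argmax(stats_k))
    return k_hat, stats_k[k_hat]

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 1, 40), rng.normal(1, 1, 40)])
print(lrt_change_point(y))
```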
Abstract: A statistical evaluation of the Baltimore County water well database is performed to gain insight into the sustainability of domestic supply wells in crystalline bedrock aquifers over the last 15 years. Variables potentially related to well yield that are considered include well construction, geology, well depth, and static water level. A variety of statistical methods are used to assess correlation and significance in a database of approximately 8,500 wells, and a logistic regression model is developed to predict the probability of well failure by geology type. Results of a two-way analysis of variance indicate that average well depth and yield are statistically different among the established geology groups, and between failed and non-failed wells. The static water level is shown to be statistically different among the geology groups but not between failed and non-failed wells. The logistic regression model indicates that well yield is the most influential variable for predicting well failure; static water level and well depth were not found to be significant.
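A minimal sketch of the logistic regression step is shown below; the column names (failed, yield_gpm, depth_ft, static_wl_ft, geology) and the synthetic data are placeholders standing in for the Baltimore County database, whose schema is not reproduced here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data mimicking the kinds of variables in the well database
rng = np.random.default_rng(7)
n = 500
wells = pd.DataFrame({
    "yield_gpm": rng.gamma(2.0, 3.0, n),
    "depth_ft": rng.normal(300, 80, n),
    "static_wl_ft": rng.normal(40, 15, n),
    "geology": rng.choice(["schist", "gneiss", "marble"], n),
})
# Failure probability driven mainly by low yield, as the abstract reports
p_fail = 1 / (1 + np.exp(0.8 * wells["yield_gpm"] - 2.0))
wells["failed"] = rng.binomial(1, p_fail)

# Logistic regression of failure on yield, depth, static water level and geology group
model = smf.logit("failed ~ yield_gpm + depth_ft + static_wl_ft + C(geology)", data=wells).fit()
print(model.summary())

# Predicted probability of failure for a new well in a given geology group
new = pd.DataFrame({"yield_gpm": [2.0], "depth_ft": [300.0],
                    "static_wl_ft": [40.0], "geology": ["schist"]})
print(model.predict(new))
```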
Abstract: Shared frailty models are often used to model heterogeneity in survival analysis. The most common shared frailty model is one in which the hazard function is the product of a random factor (the frailty) and a baseline hazard function common to all individuals. Certain assumptions are made about the baseline distribution and the distribution of the frailty, the frailty most often being assumed to follow a gamma distribution. To compare results with the gamma frailty model, we introduce three shared frailty models with the generalized exponential as the baseline distribution: the inverse Gaussian, compound Poisson and compound negative binomial shared frailty models. We fit these models to the real-life bivariate survival data set of McGilchrist and Aisbett (1991) on kidney infection using Markov chain Monte Carlo (MCMC) techniques. Models are compared using Bayesian model selection criteria and a better model is suggested for the data.
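The sketch below illustrates the MCMC fitting idea under strong simplifying assumptions: a gamma frailty with an exponential (not the paper's generalized exponential) baseline, no censoring, and simulated placeholder data rather than the McGilchrist and Aisbett kidney-infection times.

```python
import numpy as np
import pymc as pm
import arviz as az

# Placeholder data: two recurrence times per patient sharing one frailty
rng = np.random.default_rng(5)
n_pairs = 38
pairs = np.repeat(np.arange(n_pairs), 2)
true_z = rng.gamma(2.0, 0.5, n_pairs)                    # shared frailties, mean 1
times = rng.exponential(1.0 / (0.01 * true_z[pairs]))

with pm.Model() as frailty_model:
    theta = pm.Gamma("theta", alpha=1.0, beta=1.0)                         # frailty variance parameter
    z = pm.Gamma("z", alpha=1.0 / theta, beta=1.0 / theta, shape=n_pairs)  # mean-one frailties
    lam = pm.Gamma("lam", alpha=1.0, beta=1.0)                             # baseline hazard rate
    pm.Exponential("t", lam=lam * z[pairs], observed=times)                # hazard = frailty * baseline
    trace = pm.sample(1000, tune=1000, target_accept=0.9)

print(az.summary(trace, var_names=["theta", "lam"]))
```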
Brand Cluster is proposed against the background of evolving consumption modes and concepts, as well as the brand preferences of different categories of consumers. Supported by inter-urban, inter-category and inter-brand big data, and after deep learning and in-depth analysis of the consumption relations among different brands, Brand Cluster was created to reflect the characteristics of diverse consumers. We try to understand the inner features of 18 clusters of brands and how these clusters appear in different cities, which underlies brand owners' choice of cities. Brand Cluster is believed to reveal the relationships between “allies” of brands from a whole new angle and at scale. In addition, the make-up of brand clusters in different cities indicates whether a new city is appropriate for brand owners to expand into.
Abstract: A statistical approach, based on artificial neural networks, is proposed for the post-calibration of weather radar rainfall estimation. The artificial neural networks tested include multilayer feedforward networks and radial basis functions. The multilayer feedforward training algorithms consisted of four variants of the gradient descent method, four variants of the conjugate gradient method, Quasi-Newton, One Step Secant, resilient backpropagation, the Levenberg-Marquardt method and the Levenberg-Marquardt method with Bayesian regularization. The radial basis networks were the radial basis functions and the generalized regression networks. In general, the results showed that the Levenberg-Marquardt algorithm with Bayesian regularization is a robust and reliable algorithm for the post-calibration of weather radar rainfall estimation. This method benefits from the convergence speed of the Levenberg-Marquardt algorithm and from the overfitting control of Bayes' theorem. All the other multilayer feedforward training algorithms fail because they often lead to overfitting or converge to a local minimum, which prevents them from generalizing from the data. Radial basis networks are also problematic since they are very sensitive when used with sparse data.
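A minimal sketch of the winning approach is given below, under strong simplifications: a single-hidden-layer network fitted with scipy's Levenberg-Marquardt least-squares solver, an ordinary weight penalty standing in for full Bayesian regularization, and synthetic radar/gauge pairs in place of real data.

```python
import numpy as np
from scipy.optimize import least_squares

def net(params, x, hidden=5):
    """One-hidden-layer feedforward network mapping radar estimates to gauge rainfall."""
    W1 = params[:hidden].reshape(hidden, 1)
    b1 = params[hidden:2 * hidden]
    W2 = params[2 * hidden:3 * hidden]
    b2 = params[-1]
    h = np.tanh(x[:, None] * W1.T + b1)
    return h @ W2 + b2

def residuals(params, x, y, lam=1e-3):
    # Data misfit plus a ridge-style weight penalty (a crude stand-in for Bayesian regularization)
    return np.concatenate([net(params, x) - y, np.sqrt(lam) * params])

rng = np.random.default_rng(4)
radar = rng.uniform(0, 30, 200)                                          # radar rainfall estimates (mm)
gauge = 0.8 * radar + 2 * np.sin(radar / 5) + rng.normal(0, 1, 200)      # synthetic gauge values (mm)

p0 = rng.normal(0, 0.5, 3 * 5 + 1)
fit = least_squares(residuals, p0, args=(radar, gauge), method="lm")     # Levenberg-Marquardt
print("calibrated estimate at 10 mm:", net(fit.x, np.array([10.0])))
```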
Abstract: The change point problem has been studied extensively since the 1950s owing to its broad applications in fields such as finance and biology. As a special case of the multiple change point problem, the epidemic change point problem has received considerable attention, especially in medical studies. In this paper, a nonparametric method based on the empirical likelihood is proposed to detect epidemic changes in the mean after unknown change points. Under some mild conditions, the asymptotic null distribution of the empirical likelihood ratio test statistic is proved to be an extreme value distribution. The consistency of the test is also proved. Simulations indicate that the test performs comparably to other available tests while imposing fewer constraints on the data distribution. The method is applied to the Stanford heart transplant data and detects the change points successfully.
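The building block of such a test, the empirical likelihood ratio for a segment mean, can be sketched as follows (the scan over candidate epidemic change points and the extreme value calibration are not reproduced here):

```python
import numpy as np
from scipy.optimize import brentq

def el_ratio_mean(x, mu0):
    """-2 log empirical likelihood ratio for H0: E[X] = mu0."""
    d = np.asarray(x, dtype=float) - mu0
    if d.min() >= 0 or d.max() <= 0:          # mu0 outside the convex hull: H0 untenable
        return np.inf
    # The Lagrange multiplier lam must keep every weight positive: 1 + lam * d_i > 0
    lo = -1.0 / d.max() + 1e-10
    hi = -1.0 / d.min() - 1e-10
    g = lambda lam: np.sum(d / (1.0 + lam * d))
    lam = brentq(g, lo, hi)                   # g is monotone decreasing, so a unique root exists
    return 2.0 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(2)
x = rng.normal(0.3, 1.0, 100)
print(el_ratio_mean(x, 0.0))                  # compare with chi-square(1) quantiles
```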
Abstract: In this paper, we propose a flexible cure rate survival model by assuming that the number of competing causes of the event of interest follows the negative binomial distribution and the time to event follows a generalized gamma distribution. We define the negative binomial-generalized gamma distribution, which can be used to model survival data. The new model includes as special cases some of the well-known cure rate models discussed in the literature. We consider a frequentist analysis and nonparametric bootstrap for parameter estimation of a negative binomial-generalized gamma regression model with cure rate. Then, we derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to perform global influence analysis. Finally, we analyze a real data set from the medical area.
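A hedged sketch of one common negative binomial cure rate likelihood with a generalized gamma time-to-event is shown below; it uses a standard parameterization and synthetic placeholder data, and is not claimed to match the paper's exact regression formulation.

```python
import numpy as np
from scipy import stats, optimize

def neg_log_lik(params, t, d):
    """Negative log-likelihood of a negative binomial cure rate model with a
    generalized gamma time-to-event; t = observed times, d = 1 if the event occurred."""
    theta, eta, a, c, scale = np.exp(params)                 # keep all parameters positive
    F = stats.gengamma.cdf(t, a, c, scale=scale)
    f = stats.gengamma.pdf(t, a, c, scale=scale)
    Spop = (1.0 + eta * theta * F) ** (-1.0 / eta)           # population survival
    fpop = theta * f * (1.0 + eta * theta * F) ** (-1.0 / eta - 1.0)
    return -np.sum(d * np.log(fpop + 1e-300) + (1 - d) * np.log(Spop + 1e-300))

# Synthetic placeholder data, not the medical data set analyzed in the paper
rng = np.random.default_rng(6)
t = stats.gengamma.rvs(2.0, 1.0, scale=5.0, size=150, random_state=rng)
d = rng.integers(0, 2, size=150)

res = optimize.minimize(neg_log_lik, x0=np.zeros(5), args=(t, d), method="Nelder-Mead")
theta_hat, eta_hat = np.exp(res.x[:2])
print("estimated cure fraction:", (1 + eta_hat * theta_hat) ** (-1 / eta_hat))
```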
A new four-parameter lifetime distribution, named the power Lomax Poisson, is introduced and studied. The distribution is obtained by combining the power Lomax and Poisson distributions. Structural properties of the power Lomax Poisson model are investigated. Estimation of the model parameters is performed using maximum likelihood, least squares and weighted least squares techniques. An intensive simulation study evaluates the performance of the different estimators in terms of their relative biases, standard errors and mean square errors. Finally, the superiority of the new compounding distribution over some existing distributions is illustrated by means of two real data sets. The results show that the suggested model can produce better fits than some well-known distributions.
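Since the power Lomax Poisson density is not written out here, the simulation-comparison idea can be sketched with the plain Lomax distribution as a stand-in: the same bias/MSE comparison of maximum likelihood against (weighted) least squares estimators would carry over to the new model.

```python
import numpy as np
from scipy import stats, optimize

def ls_estimate(x, weighted=False):
    """(Weighted) least squares fit of the Lomax shape and scale by matching
    the fitted cdf at the order statistics to plotting positions."""
    xs = np.sort(x)
    n = len(xs)
    p = np.arange(1, n + 1) / (n + 1)                        # plotting positions
    w = 1.0 / (p * (1 - p)) if weighted else np.ones(n)      # one common WLS weight choice
    def loss(log_theta):
        c, scale = np.exp(log_theta)                         # keep parameters positive
        return np.sum(w * (stats.lomax.cdf(xs, c, scale=scale) - p) ** 2)
    res = optimize.minimize(loss, x0=np.log([1.5, 1.0]), method="Nelder-Mead")
    return np.exp(res.x)

rng = np.random.default_rng(3)
true = np.array([2.0, 1.0])                                  # true shape and scale
mle, lse = [], []
for _ in range(200):
    x = stats.lomax.rvs(true[0], scale=true[1], size=100, random_state=rng)
    c_hat, _, s_hat = stats.lomax.fit(x, floc=0)             # maximum likelihood
    mle.append([c_hat, s_hat])
    lse.append(ls_estimate(x))                               # least squares
for name, est in [("MLE", np.array(mle)), ("LS", np.array(lse))]:
    print(name, "bias:", est.mean(axis=0) - true, "MSE:", ((est - true) ** 2).mean(axis=0))
```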