Abstract: A core task in analyzing randomized clinical trials based on longitudinal data is to find the best way to describe the change over time for each treatment arm. We review the implementation and estimation of a flexible piecewise Hierarchical Linear Model (HLM) to model change over time. The flexible piecewise HLM consists of two phases with differing rates of change. The breakpoints between these two phases, as well as the rates of change per phase, are allowed to vary between treatment groups as well as between individuals. While this approach may provide better model fit, it is not clear how to quantify treatment differences over the longitudinal period. In this paper, we develop a procedure for summarizing the longitudinal data for the flexible piecewise HLM along the lines of Cook et al. (2004). We focus on quantifying the overall treatment efficacy using the area under the curve (AUC) of the individual flexible piecewise HLM models. Methods are illustrated through data from a placebo-controlled trial in the treatment of depression comparing psychotherapy and pharmacotherapy.
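As an illustration of the AUC summary described above, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a two-phase linear trajectory for one individual with a hypothetical parameterization (intercept `b0`, first-phase slope `s1`, second-phase slope `s2`, breakpoint `tau`, follow-up length `T`) and computes the area under that trajectory in closed form. All names and the numeric values are illustrative.

```python
def trajectory(t, b0, s1, s2, tau):
    """Two-phase piecewise linear outcome: slope s1 up to tau, slope s2 afterwards."""
    return b0 + s1 * min(t, tau) + s2 * max(t - tau, 0.0)


def piecewise_auc(b0, s1, s2, tau, T):
    """Area under the two-phase trajectory on [0, T], assuming 0 <= tau <= T."""
    if not 0.0 <= tau <= T:
        raise ValueError("breakpoint tau must lie in [0, T]")
    # Phase 1: integral of b0 + s1 * t over [0, tau].
    phase1 = b0 * tau + s1 * tau ** 2 / 2.0
    # Phase 2: starts at the value reached at tau and changes with slope s2.
    y_tau = b0 + s1 * tau
    phase2 = y_tau * (T - tau) + s2 * (T - tau) ** 2 / 2.0
    return phase1 + phase2


# Hypothetical individual trajectories (illustrative numbers only).
auc_treated = piecewise_auc(b0=10.0, s1=-2.0, s2=-0.5, tau=4.0, T=12.0)
auc_control = piecewise_auc(b0=10.0, s1=-1.0, s2=-0.25, tau=6.0, T=12.0)
auc_difference = auc_treated - auc_control
```

For a symptom score where lower values are better, a smaller AUC corresponds to a lower cumulative symptom burden over the follow-up period, so the difference in AUC gives a single-number comparison between arms.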
Abstract: The aim of this paper is to identify the effects of socioeconomic factors and family planning program effort on the total fertility rate using national-level data from forty-three developing countries. The data have mainly been taken from the secondary source “Family Planning and Child Survival: 100 Developing Countries” compiled by the Center for Population and Family Health, Columbia University. Because the independent variables were found to be highly correlated among themselves, a component regression technique has been used to analyze the data. The analysis shows that family planning program effort makes the largest contribution to lowering the total fertility rate, followed by percent of urban population, female literacy rate, and infant mortality rate, in that order. Policy implications are discussed.
The so-called Kumaraswamy distribution is a special probability distribution developed to model double-bounded random processes for which the mode does not necessarily have to lie within the bounds. In this article, a generalization of the Kumaraswamy distribution called the T-Kumaraswamy family is defined using the T-R{Y} family of distributions framework. The resulting T-Kumaraswamy family is obtained using the quantile functions of some standardized distributions. Some general mathematical properties of the new family are studied. Five new generalized Kumaraswamy distributions are proposed using the T-Kumaraswamy method. Real data sets are further used to test the applicability of the new family.
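For orientation, the base Kumaraswamy distribution on the unit interval has the closed-form CDF F(x) = 1 - (1 - x^a)^b with shape parameters a, b > 0, which makes its quantile function explicit. The sketch below covers only this base distribution, not the T-Kumaraswamy family proposed in the article; the function names are illustrative.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """CDF of the Kumaraswamy(a, b) distribution on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Inverse of the CDF: solves 1 - (1 - x**a)**b = u for x."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kumaraswamy_sample(n, a, b, seed=0):
    """Draw n values by inverse-transform sampling from uniform variates."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]


draws = kumaraswamy_sample(1000, a=2.0, b=5.0)
```

Because both the CDF and the quantile function are available in closed form, the distribution is a convenient building block for constructions that compose distribution and quantile functions.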
Compositional data are positive multivariate data, constrained to lie within the simplex space. Regression analysis of such data has been studied and many regression models have been proposed, but most of them do not allow for zero values. Moreover, the case of compositional data on the predictor side has gained little research interest. Surprisingly enough, the case of both the response and predictor variables being compositional data has not been widely studied. This paper suggests a solution for this last problem. Principal components regression using the 𝛼-transformation and the Kullback-Leibler divergence are the key elements of the proposed approach. An advantage of this approach is that zero values are allowed in both the response and the predictor variables. Simulation studies and examples with real data illustrate the performance of our algorithm.
Abstract: Simultaneous testing of a huge number of hypotheses is a core issue in high-throughput experimental methods such as microarrays for transcriptomic data. In the central debate about the type I error rate, Benjamini and Hochberg (1995) proposed a procedure that is shown to control the now popular False Discovery Rate (FDR) under the assumption of independence between the test statistics. These results were extended to a larger class of dependency by Benjamini and Yekutieli (2001), and improvements have emerged in recent years, among which step-up procedures have shown desirable properties. The present paper focuses on the type II error rate. The proposed method improves the power by means of double-sampling test statistics integrating external information available both on the sample for which the outcomes are measured and on additional items. The small-sample distribution of the test statistics is provided, and simulation studies are used to show the beneficial impact of introducing relevant covariates in the testing strategy. Finally, the present method is implemented in a situation where microarray data are used to select the genes that affect the degree of muscle destructuration in pigs. A phenotypic covariate is introduced in the analysis to improve the search for differentially expressed genes.
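As background for the FDR discussion above, the Benjamini and Hochberg (1995) step-up procedure can be sketched as follows. This illustrates only the standard baseline procedure, not the double-sampling method proposed in the paper; the function name and the example p-values are illustrative.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step-up rule: find the largest rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose p-value has rank at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


decisions = benjamini_hochberg([0.212, 0.001, 0.039, 0.008, 0.06], q=0.05)
```

The step-up character matters: a hypothesis can be rejected even if its own p-value exceeds its rank threshold, provided a larger p-value further down the sorted list meets its threshold.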
Abstract: We consider the problem of estimating the properties of an oil reservoir, like porosity and sand thickness, in an exploration scenario where only a few wells have been drilled. We use gamma ray records measured directly from the wells as well as seismic traces recorded around the wells. To model the association between the soil properties and the signals, we fit a linear regression model. Additionally we account for the spatial correlation structure of the observations using a correlation function that depends on the distance between two points. We transform the predictor variable using discrete wavelets and then perform a Bayesian variable selection using a Metropolis search. We obtain predictions of the properties over the whole reservoir providing a probabilistic quantification of their uncertainties, thanks to the Bayesian nature of our method. The cross-validated results show that a very high accuracy can be achieved even with a very small number of wavelet coefficients.
Abstract: This paper aims to generate multivariate random vectors with a prescribed correlation matrix using the Johnson system. The probability weighted moment (PWM) is employed to estimate the parameters of the Johnson system. By equating the first four PWMs of the Johnson system with those of the target distribution, a system of equations for the parameters is established. With suitable initial values, solutions to the equations are obtained by the Newton iteration procedure. To allow for the generation of random vectors with a prescribed correlation matrix, approaches to accommodate the dependency are put forward. For the four transformation models of the Johnson system, nine cases are addressed. Analytical formulae are derived to determine the equivalent correlation coefficient in the standard normal space for six cases; the remaining three are handled by an interpolation method. Finally, several numerical examples are given to check the proposed method.
Abstract: The chi-squared test for independence in two-way categorical tables depends on the assumption that the data follow the multinomial distribution. Thus, we suggest alternatives for when the multinomial assumption does not hold. First, we consider the Bayes factor, which is used for hypothesis testing in Bayesian statistics. Unfortunately, it is sensitive to the choice of prior distributions. We note here that the intrinsic Bayes factor is not appropriate because the prior distributions under consideration are all proper. Thus, we propose using Bayesian estimation, which is generally not as sensitive to prior specifications as the Bayes factor. Our approach is to construct a 95% simultaneous credible region (i.e., a hyper-rectangle) for the interactions. A test that all interactions are zero is equivalent to a test of independence in two-way categorical tables. Thus, a 95% simultaneous credible region of the interactions provides a test of independence by inversion.
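The equivalence between zero interactions and independence can be illustrated with a small sketch. It assumes the usual log-linear decomposition with sum-to-zero constraints, in which the interaction for cell (i, j) is the doubly centered log cell probability; the paper may define the interactions differently. The function name and the example tables are illustrative, and all cell probabilities must be strictly positive.

```python
import math


def loglinear_interactions(probs):
    """Doubly centered log cell probabilities of a two-way table (sum-to-zero form)."""
    logs = [[math.log(p) for p in row] for row in probs]
    n_rows, n_cols = len(logs), len(logs[0])
    row_means = [sum(row) / n_cols for row in logs]
    col_means = [sum(logs[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [
        [logs[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n_cols)]
        for i in range(n_rows)
    ]


# An independent table: each cell is the product of its row and column margins.
row_margins = [0.3, 0.7]
col_margins = [0.2, 0.5, 0.3]
independent = [[r * c for c in col_margins] for r in row_margins]
zero_terms = loglinear_interactions(independent)

# A dependent table: mass concentrated on the diagonal.
dependent = [[0.4, 0.1], [0.1, 0.4]]
nonzero_terms = loglinear_interactions(dependent)
```

Under independence every interaction term vanishes (up to floating-point error), so a credible region for the interactions that excludes the zero vector is evidence against independence.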
In this paper, we propose a new generalization of the exponentiated modified Weibull distribution, called the McDonald exponentiated modified Weibull distribution. The new distribution includes a large number of well-known lifetime distributions as special sub-models, such as the McDonald exponentiated Weibull, beta exponentiated Weibull, exponentiated Weibull, exponentiated exponential, linear exponential, and generalized Rayleigh distributions, among others. Some structural properties of the new distribution are studied. Moreover, we discuss the method of maximum likelihood for estimating the model parameters.