Abstract: This paper provides a Bayesian approach to estimating the interest rate term structures of Treasury and corporate debt with a penalized spline model. Although the literature on term structure modeling is vast, to the best of our knowledge, all methods developed so far belong to the frequentist school. In this paper, we develop a two-step estimation procedure from a Bayesian perspective. The Treasury term structure is first estimated with a Bayesian penalized spline model. The smoothing parameter is naturally embedded in the model as a ratio of posterior variances and does not need to be selected as in the frequentist approach. The corporate term structure is then estimated by adding a credit spread to the estimated Treasury term structure, incorporating knowledge of the positive credit spread into the Bayesian model as an informative prior. In contrast to the frequentist method, the small sample size of the corporate debt poses no particular difficulty to the proposed Bayesian approach.
Compositional data consist of vectors whose components are positive, lie in the interval (0,1), and represent proportions or fractions of a “whole”. The components must sum to one. Compositional data arise in many fields, including geology, economics, and medicine. In this paper, we propose a new statistical tool for volleyball data: a Bayesian analysis for compositional regression that applies the additive log-ratio (ALR) transformation and assumes either uncorrelated or correlated errors. The Bayesian inference procedure is based on Markov chain Monte Carlo (MCMC) methods. The methodology is applied to an artificial data set and a real volleyball data set.
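As a minimal sketch of the ALR transformation named in this abstract, the snippet below maps a composition to unconstrained log-ratios and back. The choice of the last component as the reference and the example shares are assumptions made for illustration; the abstract does not state which reference the authors use.

```python
import math

def alr(composition):
    """Additive log-ratio transform, using the LAST component as reference
    (an assumption; any component could serve as the reference)."""
    if any(c <= 0 for c in composition):
        raise ValueError("ALR requires strictly positive components")
    ref = composition[-1]
    return [math.log(c / ref) for c in composition[:-1]]

def alr_inverse(y):
    """Map unconstrained log-ratios back to a composition that sums to one."""
    exps = [math.exp(v) for v in y] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical three-part composition (illustrative numbers only).
shares = [0.5, 0.3, 0.2]
y = alr(shares)          # two unconstrained real values
back = alr_inverse(y)    # recovers the original composition
```

A regression model with Gaussian errors can then be placed on the transformed values `y`, which is the usual motivation for applying the transformation first.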
The Birnbaum-Saunders generalized t (BSGT) distribution is a very flexible family of distributions that admits different degrees of skewness and kurtosis and includes some important special or limiting cases available in the literature, such as the Birnbaum-Saunders and Birnbaum-Saunders t distributions. In this paper we propose a regression-type model for the BSGT distribution based on the generalized additive models for location, scale and shape (GAMLSS) framework. The resulting model is highly flexible and therefore has great potential for modeling the distribution parameters of response variables that present light or heavy tails, i.e. platykurtic or leptokurtic shapes, as functions of explanatory variables. For different parameter settings, simulations are performed to investigate the behavior of the estimators. The potential of the new regression model is illustrated by means of a real motor vehicle insurance data set.
Abstract: A core task in analyzing randomized clinical trials based on longitudinal data is to find the best way to describe the change over time for each treatment arm. We review the implementation and estimation of a flexible piecewise Hierarchical Linear Model (HLM) to model change over time. The flexible piecewise HLM consists of two phases with differing rates of change. The breakpoints between these two phases, as well as the rates of change per phase, are allowed to vary between treatment groups as well as individuals. While this approach may provide better model fit, it is not clear how to quantify treatment differences over the longitudinal period. In this paper, we develop a procedure for summarizing the longitudinal data for the flexible piecewise HLM along the lines of Cook et al. (2004). We focus on quantifying the overall treatment efficacy using the area under the curve (AUC) of the individual flexible piecewise HLM models. Methods are illustrated through data from a placebo-controlled trial in the treatment of depression comparing psychotherapy and pharmacotherapy.
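To make the AUC summary concrete, the following sketch computes the area under a two-phase piecewise linear trajectory in closed form. It covers a single fixed trajectory only, not the hierarchical model or the authors' procedure; the parameter names (`b0`, `b1`, `b2`, `tau`) and all numeric values are hypothetical.

```python
def piecewise_mean(t, b0, b1, b2, tau):
    """Two-phase linear trajectory: slope b1 up to the breakpoint tau,
    slope b2 afterwards, continuous at tau."""
    if t <= tau:
        return b0 + b1 * t
    return b0 + b1 * tau + b2 * (t - tau)

def piecewise_auc(T, b0, b1, b2, tau):
    """Exact area under the trajectory on [0, T], assuming 0 <= tau <= T."""
    if not 0 <= tau <= T:
        raise ValueError("breakpoint must lie inside the follow-up period")
    first = b0 * tau + 0.5 * b1 * tau ** 2                     # phase 1
    level = b0 + b1 * tau                                      # value at tau
    second = level * (T - tau) + 0.5 * b2 * (T - tau) ** 2     # phase 2
    return first + second

# Hypothetical symptom-score trajectories over 12 weeks (invented numbers).
auc_treated = piecewise_auc(12, b0=20, b1=-2.0, b2=-0.5, tau=4)
auc_placebo = piecewise_auc(12, b0=20, b1=-1.0, b2=-0.25, tau=4)
auc_difference = auc_placebo - auc_treated
```

Under these invented parameters, a larger `auc_difference` would correspond to a lower cumulative symptom burden in the treated arm; in the hierarchical setting the same integral would be evaluated per individual or per group.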
Abstract: The aim of this paper is to identify the effects of socioeconomic factors and family planning program effort on the total fertility rate, using national-level data from forty-three developing countries. The data have mainly been taken from the secondary source “Family Planning and Child Survival: 100 Developing Countries”, compiled by the Center for Population and Family Health, Columbia University. Because the independent variables were found to be highly correlated among themselves, the component regression technique has been used to analyze the data. The analysis shows that family planning program effort makes the largest contribution to lowering the total fertility rate, followed by percent of urban population, female literacy rate, and infant mortality rate, in that order. Policy implications are discussed.
The so-called Kumaraswamy distribution is a special probability distribution developed to model doubly bounded random processes for which the mode does not necessarily have to be within the bounds. In this article, a generalization of the Kumaraswamy distribution called the T-Kumaraswamy family is defined using the T-R{Y} family of distributions framework. The resulting T-Kumaraswamy family is obtained using the quantile functions of some standardized distributions. Some general mathematical properties of the new family are studied. Five new generalized Kumaraswamy distributions are proposed using the T-Kumaraswamy method. Real data sets are further used to test the applicability of the new family.
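For orientation, the baseline Kumaraswamy distribution on (0,1) has the closed-form distribution function F(x) = 1 - (1 - x^a)^b and an equally simple quantile function. The sketch below implements only this baseline, not the T-Kumaraswamy family itself; the function names and the parameter values are illustrative assumptions.

```python
import random

def kumaraswamy_cdf(x, a, b):
    """CDF of the Kumaraswamy(a, b) distribution on (0, 1)."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 1.0 - (1.0 - x ** a) ** b

def kumaraswamy_quantile(u, a, b):
    """Inverse of the CDF for u in [0, 1]."""
    if not 0 <= u <= 1:
        raise ValueError("u must lie in [0, 1]")
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def kumaraswamy_sample(n, a, b, seed=0):
    """Draw n values by inverse-transform sampling."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]

# Illustrative shape parameters (not taken from the paper).
median = kumaraswamy_quantile(0.5, a=2.0, b=5.0)
draws = kumaraswamy_sample(1000, a=2.0, b=5.0)
```

Because the quantile function is available in closed form, it can be composed with the quantile function of another distribution, which is the kind of construction the abstract attributes to the T-R{Y} framework.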
Compositional data are positive multivariate data, constrained to lie within the simplex space. Regression analysis of such data has been studied and many regression models have been proposed, but most of them do not allow for zero values. Moreover, the case of compositional data on the predictor variables side has attracted little research interest. Surprisingly enough, the case of both the response and predictor variables being compositional data has not been widely studied. This paper suggests a solution for this last problem. Principal components regression using the α-transformation and the Kullback-Leibler divergence are the key elements of the proposed approach. An advantage of this approach is that zero values are allowed on both the response and the predictor variables side. Simulation studies and examples with real data illustrate the performance of our algorithm.
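One reason a Kullback-Leibler criterion can accommodate zeros in the response is the convention 0 log 0 = 0: an observed zero component contributes nothing to the divergence. The sketch below shows only this loss for a single observation; it is not the authors' algorithm, and the function name and example compositions are hypothetical.

```python
import math

def kl_divergence(observed, fitted):
    """Kullback-Leibler divergence of a fitted composition from an observed
    one. Observed zeros contribute 0 (convention 0 * log 0 = 0); fitted
    components must be strictly positive wherever the observed one is."""
    if len(observed) != len(fitted):
        raise ValueError("compositions must have the same number of parts")
    total = 0.0
    for y, mu in zip(observed, fitted):
        if y == 0:
            continue
        if mu <= 0:
            raise ValueError("fitted component must be positive where observed is")
        total += y * math.log(y / mu)
    return total

# Hypothetical observation with a zero part and a strictly positive fit.
observed = [0.6, 0.4, 0.0]
fitted = [0.5, 0.3, 0.2]
loss = kl_divergence(observed, fitted)
```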
Abstract: Simultaneous testing of a huge number of hypotheses is a core issue in high-throughput experimental methods such as microarrays for transcriptomic data. In the central debate about the type I error rate, Benjamini and Hochberg (1995) proposed a procedure that is shown to control the now popular False Discovery Rate (FDR) under the assumption of independence between the test statistics. These results were extended to a larger class of dependency by Benjamini and Yekutieli (2001), and improvements have emerged in recent years, among which step-up procedures have shown desirable properties. The present paper focuses on the type II error rate. The proposed method improves the power by means of double-sampling test statistics integrating external information available both on the sample for which the outcomes are measured and on additional items. The small-sample distribution of the test statistics is provided, and simulation studies are used to show the beneficial impact of introducing relevant covariates in the testing strategy. Finally, the present method is implemented in a situation where microarray data are used to select the genes that affect the degree of muscle destructuration in pigs. A phenotypic covariate is introduced in the analysis to improve the search for differentially expressed genes.
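As background for the FDR discussion, the Benjamini and Hochberg (1995) step-up procedure can be sketched in a few lines. This shows the standard procedure only, not the double-sampling method proposed in the paper; the function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Step-up procedure: find the largest rank k such that the k-th
    smallest p-value is <= k * q / m, then reject the k smallest."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank  # keep the LARGEST rank that passes
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Invented p-values: only the third smallest meets its threshold, yet the
# step-up rule rejects all three smallest hypotheses.
decisions = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], q=0.05)
```

The example illustrates the step-up character: the two smallest p-values exceed their own thresholds (0.0125 and 0.025) but are still rejected because the third passes its threshold of 0.0375.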
Abstract: We consider the problem of estimating the properties of an oil reservoir, like porosity and sand thickness, in an exploration scenario where only a few wells have been drilled. We use gamma ray records measured directly from the wells as well as seismic traces recorded around the wells. To model the association between the soil properties and the signals, we fit a linear regression model. Additionally, we account for the spatial correlation structure of the observations using a correlation function that depends on the distance between two points. We transform the predictor variable using discrete wavelets and then perform a Bayesian variable selection using a Metropolis search. We obtain predictions of the properties over the whole reservoir, providing a probabilistic quantification of their uncertainties, thanks to the Bayesian nature of our method. The cross-validated results show that a very high accuracy can be achieved even with a very small number of wavelet coefficients.
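A distance-based correlation structure of the kind mentioned above can be illustrated with a small sketch. The exponential form, the `range_param` parameter, and the well coordinates are all assumptions made for this example; the abstract does not specify which correlation function the authors use.

```python
import math

def exponential_correlation(points, range_param):
    """Correlation matrix with entries exp(-d_ij / range_param), where d_ij
    is the Euclidean distance between locations i and j. The exponential
    form is an assumed example of a distance-based correlation function."""
    n = len(points)
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(points[i], points[j])
            corr[i][j] = math.exp(-d / range_param)
    return corr

# Hypothetical well locations, in arbitrary distance units.
wells = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
corr = exponential_correlation(wells, range_param=5.0)
```

Correlation decays with separation, so observations from nearby wells are treated as more alike than observations from distant ones.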