The Pareto distribution is a power-law probability distribution used to describe social scientific, geophysical, actuarial, and many other types of observable phenomena. A new weighted Pareto distribution is proposed using a logarithmic weight function. Several statistical properties of the weighted Pareto distribution are derived, including the cumulative distribution function; location measures such as the mode, median, and mean; reliability measures such as the reliability function, the hazard and reversed hazard functions, and the mean residual life; moments; shape indices such as the skewness and kurtosis coefficients; and order statistics. Parametric estimation is performed to obtain estimators for the distribution parameters using three estimation methods: the maximum likelihood method, the L-moments method, and the method of moments. A numerical simulation is carried out to validate the robustness of the proposed distribution, and the distribution is fitted to a real data set to demonstrate its usefulness in real-life applications.
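The abstract does not reproduce the proposed density, but the standard weighted-distribution construction makes the idea concrete. As a sketch, assume the classical Pareto pdf f(x) = αk^α / x^(α+1) for x ≥ k and the weight w(x) = log x (one natural reading of "logarithmic weight function", requiring k ≥ 1 so the weight is nonnegative); the paper's exact weight may differ:

```latex
f_w(x) \;=\; \frac{w(x)\, f(x)}{\mathbb{E}[w(X)]}
       \;=\; \frac{\alpha k^{\alpha} \log x}{\left(\log k + 1/\alpha\right) x^{\alpha+1}},
\qquad x \ge k \ge 1,
```

using E[log X] = log k + 1/α for the Pareto. The CDF, moments, and reliability measures listed above then follow by integrating against this density.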
Owing to its suitability for modeling data with a high degree of positive skewness, a typical characteristic of claim amounts, the Weibull distribution is considered a versatile model for loss modeling in general insurance. In this paper, the Weibull distribution is fitted to a set of insurance claim data, and the probability of ultimate ruin under Weibull-distributed claims is computed using two methods: the Fast Fourier Transform and the 4-moment gamma De Vylder approximation. The values obtained from the two methods are found to be consistent. For the same model, the first two moments of the time to ruin, the deficit at the time of ruin, and the surplus just prior to ruin are computed numerically, and these moments are found to behave as expected in practical scenarios. When the surplus process is subject to interest earnings and tax payments, the probability of ultimate ruin is found to be higher than in the absence of these factors.
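As a rough sketch of the FFT route (not the paper's implementation), the following Python fits a Weibull severity by maximum likelihood and computes the ultimate ruin probability from the Pollaczek-Khinchine compound-geometric representation. The claim data, Poisson frequency, and premium loading are made-up placeholders, and the grid truncation is cruder than a careful implementation (or the De Vylder approximation) would use.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
claims = rng.weibull(0.9, 500) * 1200.0      # placeholder claim amounts

# Fit the Weibull severity by maximum likelihood (location fixed at 0).
shape, _, scale = stats.weibull_min.fit(claims, floc=0)
mu = stats.weibull_min.mean(shape, scale=scale)

lam = 10.0                                   # assumed Poisson claim frequency
premium = 1.2 * lam * mu                     # assumed 20% safety loading
rho = lam * mu / premium                     # must be < 1 for psi(u) < 1

# Discretized equilibrium (integrated-tail) density f_e(x) = (1 - F(x)) / mu.
h, n = mu / 50, 2**14
x = np.arange(n) * h
fe = h * stats.weibull_min.sf(x, shape, scale=scale) / mu
fe /= fe.sum()                               # renormalize after truncation

# Pollaczek-Khinchine: the maximal aggregate loss is a compound geometric
# sum of equilibrium draws; its pmf has transform (1-rho)/(1 - rho*fe_hat).
g = np.fft.ifft((1 - rho) / (1 - rho * np.fft.fft(fe))).real
psi = 1.0 - np.cumsum(g)                     # ruin probability psi(u) on u = x
print(f"rho = {rho:.3f}, psi(0) = {psi[0]:.3f} (theory: psi(0) = rho)")
```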
Abstract: Comparison of more than two diagnostic or screening tests for prediction of presence vs. absence of a disease or condition can be complicated when attempting to simultaneously optimize a pair of competing criteria such as sensitivity and specificity. A technique for quantifying relative superiority of a diagnostic test when a gold standard exists in this setting is described. The proposed superiority index is used to quantify and rank performance of diagnostic tests and combinations of tests. Development of a validated model containing a subset of the tests may be improved by eliminating tests having a very small value for this index. To illustrate, we present an example using a large battery of neuropsychological tests for prediction of cognitive impairment. Using the proposed index, the battery is reduced with favorable results.
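The abstract does not give the form of the superiority index, so the snippet below only illustrates the competing-criteria setup it addresses: compute sensitivity and specificity for each test against the gold standard and collapse the pair into a single ranking number, here Youden's J as a hypothetical stand-in for the proposed index.

```python
import numpy as np

def sens_spec(y_true, y_pred):
    """Sensitivity and specificity against a gold standard (1 = condition present)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

def rank_tests(y_true, predictions):
    """Rank a battery of tests by Youden's J = sensitivity + specificity - 1
    (a stand-in for the paper's index); tests with very small values are
    candidates for elimination from the battery."""
    scores = {}
    for name, y_pred in predictions.items():
        se, sp = sens_spec(y_true, y_pred)
        scores[name] = se + sp - 1
    return sorted(scores.items(), key=lambda kv: -kv[1])
```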
Abstract: Analysis of footprint data is important in the tire industry. Estimation procedures for multiple change points and unknown parameters in a segmented regression model with unknown heteroscedastic variances are developed for analyzing such data. Our approaches include both likelihood-based and Bayesian methods, with and without continuity constraints at the change points. A model selection procedure is also proposed to choose among competing models for fitting the middle segment of the data between change points. We study the performance of the two approaches and apply them to actual tire data examples. Our Maximization–Maximization–Posterior (MMP) algorithm and the likelihood-based estimation are found to be complementary to each other.
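As a minimal sketch of the likelihood side for a single change point (the paper handles multiple change points and a Bayesian MMP algorithm as well), the grid search below maximizes a Gaussian profile log-likelihood in which each segment gets its own regression line and its own error variance, mirroring the unknown-heteroscedastic-variances setup; no continuity constraint is imposed.

```python
import numpy as np

def one_changepoint(x, y, min_seg=5):
    """Profile-likelihood grid search for a single change point in a
    two-segment linear model; each segment has its own error variance,
    so the criterion is sum_j n_j * log(SSE_j / n_j) (up to constants)."""
    best_crit, best_k = np.inf, None
    for k in range(min_seg, len(x) - min_seg):
        crit = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            X = np.column_stack([np.ones_like(xs), xs])
            beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
            sse = np.sum((ys - X @ beta) ** 2)
            crit += len(ys) * np.log(sse / len(ys))
        if crit < best_crit:
            best_crit, best_k = crit, k
    return best_k, best_crit   # change-point index and its criterion value
```

Multiple change points can then be handled by applying the same search recursively or by dynamic programming; a continuity constraint would replace the two free lines with a hinge-basis parameterization.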
Abstract: Two methods for clustering data and choosing a mixture model are proposed. First, we derive a new classification algorithm based on the classification likelihood. Then, the likelihood conditional on these clusters is written as the product of the likelihoods of the individual clusters, and AIC- and BIC-type approximations are applied. The resulting criteria turn out to be the sum of the AIC or BIC of each cluster plus an entropy term. The performance of our methods is evaluated by Monte Carlo experiments and on a real data set, showing in particular that the iterative estimation algorithm generally converges quickly, so the computational load is rather low.
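A simplified, hard-assignment illustration of such a criterion, assuming Gaussian clusters (the paper's classification algorithm and exact entropy weighting may differ): classify with k-means, fit a Gaussian to each cluster, sum the per-cluster BICs, and add an entropy penalty for the partition.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

def clustered_bic(X, K):
    """Sum of per-cluster Gaussian BICs plus a partition-entropy term
    (hard-assignment sketch; assumes every cluster has more than d points)."""
    n, d = X.shape
    labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(X)
    p = d + d * (d + 1) // 2                 # mean + covariance parameters
    total, entropy = 0.0, 0.0
    for k in range(K):
        Xk = X[labels == k]
        nk = len(Xk)
        mu = Xk.mean(axis=0)
        cov = np.cov(Xk, rowvar=False) + 1e-8 * np.eye(d)
        ll = multivariate_normal.logpdf(Xk, mu, cov).sum()
        total += -2.0 * ll + p * np.log(nk)  # BIC of cluster k
        entropy -= (nk / n) * np.log(nk / n)
    return total + 2.0 * n * entropy         # illustrative entropy weighting

# Choose the number of clusters minimizing the criterion, e.g.
# best_K = min(range(2, 8), key=lambda K: clustered_bic(X, K))
```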
Abstract: Multiple binary outcomes that measure the presence or absence of medical conditions occur frequently in public health survey research. The multiple, possibly correlated, binary outcomes may comprise a syndrome or a group of related diseases. It is often of scientific interest to model the interrelationships not only between outcomes and risk factors, but also between the different outcomes. Applied and practical methods dealing with multiple outcomes from surveys with complex designs are lacking. We propose a multivariate approach based on the generalized estimating equation (GEE) methodology to simultaneously conduct survey logistic regressions for each binary outcome in a single analysis. The approach has the following attractive features: 1) it enables modeling the complete information from multiple outcomes in a single analysis; 2) it permits testing the correlations between multiple binary outcomes; 3) it allows discerning outcome-specific effects and the overall risk factor effect; and 4) it provides a measure of the difference in association between risk factors and the multiple outcomes. The proposed method is applied to a study on risk factors for heart attack and stroke in the 2009 U.S. nationwide Behavioral Risk Factor Surveillance System (BRFSS) data.
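A minimal sketch of the stacking idea behind such a GEE analysis, using statsmodels on made-up data (no survey weights, unlike the actual BRFSS analysis): each subject contributes one row per outcome, an outcome indicator and its interactions give outcome-specific effects, and an exchangeable working correlation captures within-subject dependence between the outcomes.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: two binary outcomes per subject (e.g. heart attack, stroke).
rng = np.random.default_rng(0)
n = 300
age = rng.normal(55, 10, n)
smoker = rng.integers(0, 2, n)
p1 = 1 / (1 + np.exp(-(-6 + 0.08 * age + 0.5 * smoker)))
p2 = 1 / (1 + np.exp(-(-7 + 0.09 * age + 0.3 * smoker)))
y1, y2 = rng.binomial(1, p1), rng.binomial(1, p2)

# Stack the outcomes in long format: one row per subject-outcome pair.
long = pd.DataFrame({
    "id": np.repeat(np.arange(n), 2),
    "y": np.ravel(np.column_stack([y1, y2])),
    "outcome": np.tile([0, 1], n),           # 0 = heart attack, 1 = stroke
    "age": np.repeat(age, 2),
    "smoker": np.repeat(smoker, 2),
})

# GEE logistic regression: subject id is the cluster; interactions with the
# outcome indicator separate outcome-specific from overall effects.
model = sm.GEE.from_formula(
    "y ~ age + smoker + outcome + outcome:age + outcome:smoker",
    groups="id", data=long,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())
```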
Abstract: The present paper deals with maximum likelihood and Bayes estimation procedures for the shape and scale parameters of the Poisson-exponential distribution under complete samples. Bayes estimators under symmetric and asymmetric loss functions are obtained using the Markov chain Monte Carlo (MCMC) technique. The performance of the proposed Bayes estimators is studied and compared with that of the maximum likelihood estimators, in terms of their risks, through a Monte Carlo study of simulated samples. The methodology is also illustrated on a real data set.
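As a sketch of the MCMC step, assuming the usual Poisson-exponential density f(x) = θλ exp(−θx − λe^(−θx)) / (1 − e^(−λ)) and flat priors on the log-parameters (the paper's priors and loss functions may differ), a random-walk Metropolis sampler is only a few lines:

```python
import numpy as np

def loglik(theta, lam, x):
    """Log-likelihood of the Poisson-exponential distribution
    f(x) = theta*lam*exp(-theta*x - lam*exp(-theta*x)) / (1 - exp(-lam))."""
    n = len(x)
    return (n * np.log(theta) + n * np.log(lam) - theta * x.sum()
            - lam * np.exp(-theta * x).sum() - n * np.log1p(-np.exp(-lam)))

def metropolis(x, n_iter=20000, step=0.1, seed=0):
    """Random-walk Metropolis on (log theta, log lam); flat priors on the
    log scale, so the acceptance ratio reduces to a likelihood ratio."""
    rng = np.random.default_rng(seed)
    cur, cur_ll = np.zeros(2), loglik(1.0, 1.0, x)   # start at theta = lam = 1
    draws = np.empty((n_iter, 2))
    for i in range(n_iter):
        prop = cur + step * rng.normal(size=2)
        prop_ll = loglik(np.exp(prop[0]), np.exp(prop[1]), x)
        if np.log(rng.uniform()) < prop_ll - cur_ll:
            cur, cur_ll = prop, prop_ll
        draws[i] = np.exp(cur)
    return draws  # discard burn-in before summarizing

# Bayes estimator under squared-error loss = posterior mean, e.g.
# theta_hat, lam_hat = metropolis(np.asarray(data))[5000:].mean(axis=0)
```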