Abstract: In survival analysis, the most popular regression model is the Cox proportional hazards (PH) model. Unfortunately, in practice not all data sets satisfy the PH condition, and in such cases the PH model cannot be used. To overcome this problem, the proportional odds (PO) model (Pettitt, 1982; Bennett, 1983a) and the generalized proportional odds (GPO) model (Dabrowska and Doksum, 1988) were proposed, which can be considered in some sense generalizations of the PH model. However, there are examples indicating that the use of the PO or GPO model is not appropriate either. As a consequence, a more general model must be considered. In this paper, a new model, called the proportional generalized odds (PGO) model, is introduced, which covers the PO and GPO models as special cases. Estimation of the regression parameters as well as the underlying survival function of the GPO model is discussed. An application of the model to a data set is presented.
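For orientation, the PH and PO models named in the abstract above can be written side by side in their usual textbook form. The notation here (covariate vector $z$, coefficient vector $\beta$, baseline hazard $\lambda_0$, baseline survival function $S_0$) is assumed for illustration and is not taken from the paper; the GPO and PGO models are not reproduced because the abstract does not define them.

```latex
% Proportional hazards (PH): covariates act multiplicatively on the hazard.
\lambda(t \mid z) = \lambda_0(t)\, \exp(\beta^{\top} z)

% Proportional odds (PO): covariates act multiplicatively on the odds of failure by time t.
\frac{1 - S(t \mid z)}{S(t \mid z)} = \exp(\beta^{\top} z)\, \frac{1 - S_0(t)}{S_0(t)}
```

Under PH the hazard ratio between two covariate values is constant in time, whereas under PO the odds ratio is constant and the hazard ratio tends to 1 as $t$ grows, which is one reason a data set can fit one model and not the other.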
Abstract: In this paper, a new class of five-parameter distributions, the gamma-exponentiated or generalized modified Weibull (GEMW) distribution, is proposed and studied. It includes the exponential, Rayleigh, Weibull, modified Weibull, exponentiated Weibull, exponentiated exponential, exponentiated modified Weibull, exponentiated modified exponential, gamma-exponentiated exponential, gamma-exponentiated Rayleigh, gamma-modified Weibull, gamma-modified exponential, gamma-Weibull, gamma-Rayleigh and gamma-exponential distributions as special cases. Mathematical properties of this new class of distributions, including moments, mean deviations, Bonferroni and Lorenz curves, the distribution of order statistics and Rényi entropy, are presented. The maximum likelihood technique is used to estimate the model parameters, and applications to real data sets are presented to illustrate the usefulness of this new class of distributions and its sub-models.
Abstract: The concept of frailty provides a suitable way to introduce random effects into a model to account for association and unobserved heterogeneity. In its simplest form, a frailty is an unobserved random factor that multiplicatively modifies the hazard function of an individual or of a group or cluster of individuals. In this paper, we study the positive stable distribution as the frailty distribution together with two different baseline distributions, namely the Pareto and the linear failure rate distributions. We estimate the parameters of the proposed models with a Bayesian procedure based on the Markov Chain Monte Carlo (MCMC) technique. A simulation study is carried out to compare the true parameter values with the estimated values. We fit the proposed models to the real-life bivariate survival data set of McGilchrist and Aisbett (1991) on kidney infection. We also compare the models on the same data using model selection criteria and suggest the better model.
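As a short worked illustration of the multiplicative frailty idea in the abstract above, the standard shared positive stable frailty construction is sketched below. The symbols ($Z$ for the frailty, $h_0$ and $H_0$ for the baseline hazard and cumulative hazard, $\alpha$ for the stable index) are assumed notation, not the paper's own, and the paper's specific Pareto and linear failure rate baselines are not filled in.

```latex
% Conditional on the frailty Z, the hazard is scaled multiplicatively:
h(t \mid Z) = Z\, h_0(t), \qquad S(t \mid Z) = \exp\{-Z\, H_0(t)\}

% Positive stable frailty, defined through its Laplace transform:
\mathbb{E}\!\left[e^{-sZ}\right] = \exp(-s^{\alpha}), \qquad 0 < \alpha \le 1

% Integrating Z out gives the marginal and the bivariate (shared frailty) survival functions:
S(t) = \exp\{-H_0(t)^{\alpha}\}, \qquad
S(t_1, t_2) = \exp\!\left\{-\left[H_{01}(t_1) + H_{02}(t_2)\right]^{\alpha}\right\}
```

With $\alpha = 1$ the frailty is degenerate at 1 and the two lifetimes are independent; smaller $\alpha$ corresponds to stronger within-pair association.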
Abstract: In this study, we compare various block bootstrap methods in terms of parameter estimation and the biases and mean squared errors (MSE) of the bootstrap estimators. The comparison is based on four real-world examples and an extensive simulation study with various sample sizes, parameters and block lengths. Our results reveal that the ordered and sufficient ordered non-overlapping block bootstrap methods proposed by Beyaztas et al. (2016) provide better results in terms of parameter estimation and its MSE than conventional methods. In addition, the sufficient non-overlapping block bootstrap method and its ordered version have the smallest MSE for the sample mean among all the methods compared.
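For readers unfamiliar with the conventional method that the abstract above uses as a baseline, here is a minimal sketch of the plain non-overlapping block bootstrap for the sample mean. It is not the ordered or sufficient variant of Beyaztas et al. (2016), whose details the abstract does not give; the function name `nbb_means` and its parameters are hypothetical.

```python
import random


def nbb_means(series, block_length, n_boot=1000, seed=None):
    """Bootstrap replicates of the sample mean using the conventional
    non-overlapping block bootstrap (illustrative sketch only)."""
    rng = random.Random(seed)
    n_blocks = len(series) // block_length
    if block_length < 1 or n_blocks < 1:
        raise ValueError("block_length must be between 1 and len(series)")
    # Cut the series into consecutive, disjoint blocks; a trailing
    # remainder shorter than block_length is dropped.
    blocks = [
        series[i * block_length:(i + 1) * block_length]
        for i in range(n_blocks)
    ]
    means = []
    for _ in range(n_boot):
        # Draw whole blocks with replacement and concatenate them, so the
        # dependence inside each block is preserved in the resample.
        resample = []
        for _ in range(n_blocks):
            resample.extend(rng.choice(blocks))
        means.append(sum(resample) / len(resample))
    return means
```

The spread of the returned replicates around the observed sample mean gives a bootstrap estimate of its sampling variability; the block length controls how much serial dependence each resample retains.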
Abstract: Breast cancer is the second most common type of cancer in the world (World Cancer Report, 2014 a, b). Advances in breast cancer treatment usually allow patients a longer life but also, in many cases, a relapse of the disease. Medical researchers are usually interested in analyzing the time until the occurrence of an event of interest, such as death from cancer, in the presence of right-censored data and some covariates. In some situations, two lifetimes are associated with the same patient, for example the disease-free time until recurrence and the total lifetime of the patient. In this case, it is important to assume a bivariate lifetime distribution that describes the possible dependence between the two observations. As an application, we consider different parametric bivariate lifetime distributions to analyze a breast cancer data set, treating the data as either continuous or discrete. Inferences of interest are obtained under a Bayesian approach, and the posterior summaries of interest are computed using existing MCMC (Markov Chain Monte Carlo) methods. The main goal of the study is to compare the bivariate continuous and discrete distributions and determine which better describe the breast cancer lifetimes.
Abstract: Suppose that an order restriction is imposed among several means in a time series. We are interested in testing the homogeneity of these unknown means under this restriction. In the present paper, a test based on isotonic regression is developed for monotonically ordered means in time series with stationary, short-range dependent errors. A test statistic is proposed using the penalized likelihood ratio (PLR) approach. Since the asymptotic null distribution of the test statistic is complicated, its critical values are computed by Monte Carlo simulation for several sample sizes and significance levels. A power study shows that the proposed test is more powerful than the test proposed by Brillinger (1989). Finally, to illustrate the proposed test, it is applied to a real data set containing monthly rainfall records for Iran.
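The isotonic regression step mentioned in the abstract above can be illustrated with the pool-adjacent-violators algorithm, which computes the least-squares fit subject to non-decreasing means. This is only a sketch of that building block: the PLR test statistic itself is not specified in the abstract and is not reproduced here, and the function name `pava` and its optional `weights` argument are hypothetical.

```python
def pava(values, weights=None):
    """Non-decreasing isotonic regression by pool-adjacent-violators
    (illustrative sketch; minimizes weighted squared error)."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [pooled mean, total weight, number of points].
    blocks = []
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the ordering constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, count2 = blocks.pop()
            mean1, w1, count1 = blocks.pop()
            total = w1 + w2
            pooled = (mean1 * w1 + mean2 * w2) / total
            blocks.append([pooled, total, count1 + count2])
    # Expand the pooled blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

For example, applying `pava` to the means `[1, 3, 2, 4]` pools the two out-of-order middle values into their average and leaves the rest unchanged. Under the null hypothesis of homogeneity all means would be pooled into a single value, so the distance between the isotonic fit and the overall mean is the natural ingredient of an order-restricted test.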
Abstract: Simulation studies are important statistical tools used to investigate the performance, properties and adequacy of statistical models. The simulation of right-censored time-to-event data involves generating from two independent distributions, where the first distribution represents the uncensored survival times and the second represents the censoring mechanism. In this brief report we discuss how to generate such data so that the percentage of censored observations is defined in advance. The described method was used to generate data from a Weibull distribution, but it can be adapted to any other lifetime distribution. We also present an R function for generating random samples using the proposed approach.
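One common way to fix the censoring percentage in advance, consistent with the abstract above but not necessarily the authors' method, is to choose the parameter of the censoring distribution so that P(C < T) equals the target. The sketch below assumes Weibull survival times and exponential censoring times, and uses Python rather than the paper's R; all function names, the integration grid and the search bracket for the rate are assumptions made for illustration.

```python
import math
import random


def censoring_probability(rate, shape, scale, n_grid=2000):
    """P(C < T) for independent T ~ Weibull(shape, scale) and
    C ~ Exponential(rate), by midpoint integration.

    With u = 1 - exp(-rate * c), the probability equals the integral
    over (0, 1) of S_T(-log(1 - u) / rate).
    """
    total = 0.0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        c = -math.log(1.0 - u) / rate
        total += math.exp(-((c / scale) ** shape))
    return total / n_grid


def solve_censoring_rate(target, shape, scale, tol=1e-8):
    """Find the exponential censoring rate giving the target censoring
    proportion, by bisection on the log scale (assumed bracket)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    lo, hi = 1e-6, 1e6
    while hi / lo > 1.0 + tol:
        mid = math.sqrt(lo * hi)
        # A larger rate means earlier censoring, hence more censoring.
        if censoring_probability(mid, shape, scale) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def simulate_right_censored(n, shape, scale, target, seed=None):
    """Return n pairs (observed time, event indicator), where the
    indicator is 1 for an observed event and 0 for a censored time."""
    rng = random.Random(seed)
    rate = solve_censoring_rate(target, shape, scale)
    data = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # arguments: scale, then shape
        c = rng.expovariate(rate)
        data.append((min(t, c), 1 if t <= c else 0))
    return data
```

Calling `simulate_right_censored(1000, shape=1.5, scale=2.0, target=0.3)` yields a sample whose censored fraction is 30% in expectation; any single sample will deviate from that by sampling error.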
Abstract: A powerful methodology for exploring relationships among items, association rules analysis can be used to capture a set of rules from any given dataset. It is little known, however, that a single dataset can be represented by more than one set of rules, i.e., by equivalent models. In fact, most studies on the goodness of a model can be misleading because they assume the model is unique. This is a phenomenon that the literature has yet to explore. In our study, we demonstrate that equivalent models exist for any dataset and propose a method for converting any given model into its dominant model, which we recommend as the benchmark model. Further, we explain how the phenomenon of equivalent models affects decision tree analysis and statistical model selection. We show that the decision rules from a decision tree analysis can always be simplified by reducing them to the dominant model. Simulated and real datasets are used for illustration.
Abstract: Unemployment is one of the most important issues in macroeconomics. It creates many economic and social problems in the economy. The condition and qualifications of a country's labor force reflect its economic development. In light of these facts, a developing country should overcome the problem of unemployment. In this study, the robust biased methods Robust Ridge Regression (RRR), Robust Principal Component Regression (RPCR) and RSIMPLS are compared with each other and with their classical versions, Ridge Regression (RR), Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR), in terms of predictive ability, using the trimmed Root Mean Squared Error (TRMSE) statistic, when both multicollinearity and outliers are present in an unemployment data set for Turkey. The results show that the RRR model is the best model for determining the unemployment rate in Turkey for the period 1985-2012. The robust biased RRR method shows that the most important independent variable affecting the unemployment rate is Purchasing Power Parities (PPP). The least important variables affecting the unemployment rate are Import Growth Rate (IMP) and Export Growth Rate (EXP). Hence, any increment in PPP causes an important increase in the unemployment rate, whereas any increment in IMP causes an unimportant increase and any increment in EXP causes an unimportant decrease in the unemployment rate.
Abstract: Despite its unreasonable feature independence assumption, the naive Bayes classifier provides a simple way of assigning an observation to a class given the observed features, yet competes well with more sophisticated classifiers under the zero-one loss function. However, it has been shown that naive Bayes works poorly in estimation and in classification in some cases where the features are correlated. Researchers have therefore developed many approaches to free naive Bayes from this primary assumption, which is rarely satisfied in the real world. In this paper, we propose a new classifier that is also free of the independence assumption: it evaluates the dependence of features through pair copulas constructed via a graphical model called the D-vine tree. This tree structure helps to decompose the multivariate dependence into many bivariate dependencies and thus makes it possible to evaluate the dependence of features easily and efficiently, even for data with high dimension and large sample size. We further extend the proposed method to features with discrete-valued entries. Experimental studies show that the proposed method performs well in both the continuous and discrete cases.