Abstract: In this paper we propose a new three-parameter lifetime distribution with decreasing hazard function, the long-term exponential geometric distribution. The new distribution arises in latent competing-risks scenarios, where the lifetime associated with a particular risk is not observable and only the minimum lifetime among all risks is observed, and where long-term survivors are present. The properties of the proposed distribution are discussed, including its probability density function and explicit algebraic formulas for its survival and hazard functions, order statistics, Bonferroni function and Lorenz curve. Parameter estimation is based on the usual maximum likelihood approach. We compare the new distribution with its particular case, the long-term exponential distribution, as well as with the long-term Weibull distribution on two real datasets, observing its potential and competitiveness in comparison with a usual lifetime distribution.
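The latent competing-risks construction can be made concrete under assumptions the abstract does not state. As a sketch, suppose the number of latent risks M is geometric with P(M = m) = (1 - p)p^{m-1}, the latent lifetimes are independent exponentials with rate β, and the long-term survivors enter through a standard mixture with cured fraction π. The symbols p, β and π are illustrative, and the paper's own parametrization may differ.

```latex
\begin{align*}
S_0(t) &= \sum_{m=1}^{\infty} (1-p)\,p^{m-1} e^{-m\beta t}
        = \frac{(1-p)\,e^{-\beta t}}{1 - p\,e^{-\beta t}}, \\
h_0(t) &= -\frac{d}{dt}\log S_0(t) = \frac{\beta}{1 - p\,e^{-\beta t}}, \\
S_{\mathrm{pop}}(t) &= \pi + (1-\pi)\,S_0(t).
\end{align*}
```

Under these assumptions h_0(t) decreases in t because its denominator increases, and S_pop(t) levels off at π as t grows, which is the long-term survival plateau.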
In this paper, we consider a new generalization of the paralogistic distribution, which we call the three-parameter paralogistic distribution. Some properties of the new distribution are obtained, including the survival function, hazard function, quantile function, moments, Rényi entropy and the maximum likelihood estimators (MLEs) of its parameters. A simulation study indicates that the MLEs of the parameters are consistent and asymptotically unbiased. The new three-parameter paralogistic distribution is applied to a real lifetime data set alongside related existing distributions, namely the paralogistic, gamma, transformed beta, log-logistic and inverse paralogistic distributions. The results show that the new distribution is superior to these distributions in terms of the Akaike information criterion (AIC) and the Kolmogorov-Smirnov (K-S) statistic. This claim is further supported by the density plots, P-P plots and Q-Q plots of the distributions for the data set under study.
A new four-parameter model called the Marshall-Olkin extended generalized Gompertz distribution is introduced. Its hazard rate function can be constant, increasing, decreasing, upside-down bathtub or bathtub-shaped depending on its parameters. Some mathematical properties of this model such as expansion for the density function, moments, moment generating function, quantile function, mean deviations, mean residual life, order statistics and Rényi entropy are derived. The maximum likelihood technique is used to estimate the unknown model parameters and the observed information matrix is determined. The applicability of the proposed model is shown by means of a real data set.
Abstract: In recent years Singular Spectrum Analysis (SSA), a powerful technique in time series analysis, has been developed and applied to many practical problems. In this paper, the performance of the SSA technique is assessed by applying it to a well-known time series data set, namely, monthly accidental deaths in the USA. The results are compared with those obtained using Box-Jenkins SARIMA models, the ARAR algorithm and the Holt-Winters algorithm (as described in Brockwell and Davis (2002)). The results show that the SSA technique gives a much more accurate forecast than the other methods indicated above.
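The abstract does not spell out the SSA steps, so the following is only a minimal sketch of the basic decomposition and reconstruction stage (embedding, singular value decomposition, grouping of the leading components, diagonal averaging). The function name `ssa_reconstruct` and its parameters `window` and `rank` are illustrative assumptions, and the forecasting step used in the paper is not reproduced here.

```python
import numpy as np


def ssa_reconstruct(series, window, rank):
    """Basic SSA smoothing: embed, SVD, keep `rank` components, average diagonals.

    `window` (embedding length) and `rank` (number of leading components kept)
    are hypothetical tuning choices, not values taken from the paper.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Embedding: column j of the trajectory matrix holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Decomposition: singular value decomposition of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Grouping: keep only the leading `rank` elementary matrices.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: every entry of column j estimates x[j + i] for row i.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts
```

For a monthly series, a window that is a multiple of 12 is a common choice, since it lets seasonal components separate cleanly from the trend.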
Subsampling the data is used in this paper as a method for learning about the influence of individual data points on inference about the parameters of a fitted logistic regression model. Alternative, alternative regularized, alternative regularized lasso, and alternative regularized ridge estimators are proposed for the parameters of logistic regression models and are compared with the maximum likelihood estimators. The alternative regularized estimators are obtained using a tuning parameter, whereas the alternative estimators are not regularized. The alternative regularized lasso estimators are standard lasso estimators averaged over subsets of groups, and the alternative regularized ridge estimators are likewise standard ridge estimators averaged over subsets of groups, where the number of subsets may be smaller than the number of parameters. The values of the tuning parameters are chosen to make the alternative regularized estimators very close to the maximum likelihood estimators, and the process is illustrated with two real data sets as well as a simulation study. Unlike the maximum likelihood estimators, the alternative and alternative regularized estimators always have closed-form expressions in terms of the observations. When the maximum likelihood estimators have no closed-form expressions, the alternative regularized estimators thus provide approximate closed-form expressions for them.
Abstract: In any sports competition, there is strong interest in knowing which team will be the champion at the end of the championship. Besides this, the result of a match, the chance of a team qualifying for a specific tournament, the chance of being relegated, the best attack and the best defense, among others, are also subjects of interest. In this paper we present a simple method with good predictive quality, easy implementation and low computational effort, which allows the calculation of all the quantities of interest above. Following Lee (1997), we estimate the average goals scored by each team by assuming that the number of goals scored by a team in a match follows a univariate Poisson distribution, but we consider linear models that express the sum and the difference of goals scored in terms of five covariates: the goal average in a match, the home-team advantage, the team’s offensive power, the opponent team’s defensive power and a crisis indicator. The methodology is applied to the 2008-2009 English Premier League.
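Once the mean goals of each team have been estimated, match outcome probabilities follow directly from the Poisson assumption. The sketch below additionally assumes that the two teams' goal counts are independent, which the abstract does not state, and it does not reproduce the paper's linear models for the sum and difference of goals. The function names and the truncation point `max_goals` are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def match_outcome_probs(home_mean, away_mean, max_goals=15):
    """Return (P(home win), P(draw), P(away win)).

    Assumes independent Poisson goal counts (an assumption of this sketch)
    and truncates each score at `max_goals`, beyond which the mass is tiny
    for realistic football means.
    """
    home = [poisson_pmf(g, home_mean) for g in range(max_goals + 1)]
    away = [poisson_pmf(g, away_mean) for g in range(max_goals + 1)]
    win = draw = loss = 0.0
    for h, p_home in enumerate(home):
        for a, p_away in enumerate(away):
            joint = p_home * p_away
            if h > a:
                win += joint
            elif h == a:
                draw += joint
            else:
                loss += joint
    return win, draw, loss


# Hypothetical means: home side expected to score 1.6 goals, away side 1.1.
home_win, draw, away_win = match_outcome_probs(1.6, 1.1)
```

Simulating every remaining fixture with such probabilities is one way to obtain championship, qualification and relegation chances.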
Abstract: Information regarding small area prevalence of chronic disease is important for public health strategy and resourcing equity. This paper develops a prevalence model taking account of survey and census data to derive small area prevalence estimates for diabetes. The application involves 32,000 small area subdivisions (zip code census tracts) of the US, with the prevalence estimates taking account of information from the US-wide Behavioral Risk Factor Surveillance System (BRFSS) survey on population prevalence differentials by age, gender, ethnic group and education. The effects of such aspects of population composition on prevalence are widely recognized. However, the model also incorporates spatial or contextual influences via spatially structured effects for each US state; such contextual effects are allowed to differ between ethnic groups and other demographic categories using a multivariate spatial prior. A Bayesian estimation approach is used, and the analysis demonstrates the considerably improved fit of a fully specified compositional-contextual model as compared to simpler ‘standard’ approaches, which are typically limited to age and area effects.
Abstract: This note underscores important considerations that should be taken into account when teaching students to check for inadequacies of a given linear, nonlinear or logistic regression model. Key illustrations are provided which highlight the shortcomings of currently used procedures. A brief overview of nonlinear regression models is given in order to lay the foundation for testing for lack of fit in nonlinear models. This paper also introduces a new ‘scaled’ binary logistic regression model to highlight potential problems with the usual logistic model, and implications for choosing a robust optimal experimental design are also discussed. Key words: Lack of fit, logistic regression, nonlinear regression, optimal de
Abstract: In this paper we analyze the weight loss behaviour of Mexican garlic under different storage conditions. Garlic is an important Mexican export product. Quality losses during storage are important to understand because of their implications for cost and sale opportunities. Weight loss profiles for each experimental condition, represented as functions, are modeled by means of functional linear models, and hypothesis tests are performed to compare treatments. Monte Carlo sampling versions of permutation tests are used to obtain p-values. The functional approach clearly identified storage regimes that significantly decrease the speed of deterioration of the product relative to traditional Mexican agricultural practices.
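A Monte Carlo permutation test of the kind mentioned above can be sketched for two groups of curves sampled on a common time grid. The test statistic used here, the summed squared difference between the two group mean curves, is an assumption chosen for illustration; the paper's statistic is derived from its functional linear models and is not given in the abstract. The function name and parameters are hypothetical.

```python
import numpy as np


def permutation_pvalue(curves_a, curves_b, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for a difference between mean curves.

    Each input is a 2-D array with one row per curve and one column per
    time point. The statistic is the summed squared difference of the
    group mean curves (an illustrative choice, not the paper's).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(curves_a, dtype=float)
    b = np.asarray(curves_b, dtype=float)
    pooled = np.vstack([a, b])
    n_a = a.shape[0]
    n_total = pooled.shape[0]

    def statistic(index):
        mean_diff = (pooled[index[:n_a]].mean(axis=0)
                     - pooled[index[n_a:]].mean(axis=0))
        return float(np.sum(mean_diff ** 2))

    observed = statistic(np.arange(n_total))
    exceed = 0
    for _ in range(n_perm):
        # Randomly reassign curves to the two treatment labels.
        if statistic(rng.permutation(n_total)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Sampling random relabelings instead of enumerating all of them keeps the cost fixed at `n_perm` evaluations, at the price of some Monte Carlo error in the p-value.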
Abstract: Historians have shown great interest in the Southern Illinois mine war. One explanation has been that this war was caused by miners who held radical political beliefs. We examine this view by applying four methods of ecological inference to estimate the proportion of coal miners who were socialist voters in this period. Based on these results (especially considering the assumptions of the methods), we conclude that miners were politically less radical than previously thought.