Abstract: Kang (2006) used the log-likelihood function with Lagrangian multipliers for estimation of cell probabilities in two-way incomplete contingency tables. The constraints on cell probabilities can be incorporated through Lagrangian multipliers for the likelihood function. The method can be readily extended to multidimensional tables. Variances of the MLEs are derived from the matrix of second derivatives of the log likelihood with respect to cell probabilities and the Lagrange multiplier. Wald and likelihood ratio tests of independence are derived using the estimates and estimated variances. Simulation results, when data are missing at random, reveal that maximum likelihood estimation (MLE) produces more efficient estimates of population proportions than either multiple imputation (MI) based on data augmentation or complete case (CC) analysis. Neither MLE nor MI, however, leads to an improvement over CC analysis with respect to power of tests for independence in 2×2 tables. Thus, the partially classified marginal information increases precision about proportions, but is not helpful for judging independence.
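As a rough illustration of estimating cell probabilities from a two-way table with partially classified counts, the sketch below uses the EM algorithm. This is a swapped-in technique, not the Lagrangian-multiplier derivation the abstract describes, and it assumes the missingness is ignorable (missing at random). The function name `em_incomplete_2way` and its arguments are hypothetical.

```python
def em_incomplete_2way(full, row_only, col_only, max_iter=1000, tol=1e-12):
    """EM estimate of cell probabilities for an r x c table.

    full     : r x c counts with both factors observed
    row_only : length-r counts with only the row factor observed
    col_only : length-c counts with only the column factor observed
    Assumes ignorable missingness and that no row or column margin
    of the current estimate is zero.
    """
    r, c = len(full), len(full[0])
    n = sum(map(sum, full)) + sum(row_only) + sum(col_only)
    # Start from the uniform table.
    p = [[1.0 / (r * c)] * c for _ in range(r)]
    for _ in range(max_iter):
        row_tot = [sum(p[i]) for i in range(r)]
        col_tot = [sum(p[i][j] for i in range(r)) for j in range(c)]
        # E-step: allocate partially classified counts to cells in
        # proportion to the current conditional probabilities.
        # M-step: divide the completed counts by the total sample size.
        new_p = [
            [
                (full[i][j]
                 + row_only[i] * p[i][j] / row_tot[i]
                 + col_only[j] * p[i][j] / col_tot[j]) / n
                for j in range(c)
            ]
            for i in range(r)
        ]
        change = max(abs(new_p[i][j] - p[i][j])
                     for i in range(r) for j in range(c))
        p = new_p
        if change < tol:
            break
    return p


# Made-up counts for illustration only.
estimate = em_incomplete_2way([[10, 20], [30, 40]], [5, 15], [8, 12])
```

When there are no partially classified counts, the estimate reduces to the observed cell proportions.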
Abstract: True-value theory (Bechtel, 2010), as an extension of randomization theory, allows arbitrary measurement errors to pervade a survey score as well as its predictor scores. This implies that true scores need not be expectations of observed scores and that expected errors need not be zero within a respondent. Rather, weaker assumptions about measurement errors over respondents enable the regression of true scores on true predictor scores. The present paper incorporates Sarndal-Lundstrom (2005) weight calibration into true-value regression. This correction for non-response is illustrated with data from the fourth round of the European Social Survey (ESS). The results show that a true-value regression coefficient can be corrected even with a severely unrepresentative sample. They also demonstrate that this regression slope is attenuated more by measurement error than by non-response. Substantively, this ESS analysis establishes economic anxiety as an important predictor of life quality in the financially stressful year of 2008.
Abstract: This paper introduces a multivariate non-parametric hazard model for the occurrence of earthquakes; the hazard function defines the statistical distribution of inter-event times. The method is applied to Turkish seismicity, since a significant portion of Turkey is subject to frequent earthquakes. Destructive earthquakes from 1903 to 2009 between latitudes 39-42°N and longitudes 26-45°E are used. The paper demonstrates how seismicity and tectonics/physics parameters can potentially influence the spatio-temporal variability of earthquakes, and the method presents several advantages compared to more traditional approaches.
Abstract: We derive three likelihood-based confidence intervals for the risk ratio of two proportion parameters using a double sampling scheme for misclassified binomial data. The risk ratio is also known as the relative risk. We obtain closed-form maximum likelihood estimators of the model parameters by maximizing the full-likelihood function. Moreover, we develop three confidence intervals: a naive Wald interval, a modified Wald interval, and a Fieller-type interval. We apply the three confidence intervals to cervical cancer data. Finally, we perform two Monte Carlo simulation studies to assess and compare the coverage probabilities and average lengths of the three interval estimators. Unlike the other two interval estimators, the modified Wald interval always produces close-to-nominal confidence intervals for the various simulation scenarios examined here. Hence, the modified Wald confidence interval is preferred in practice.
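For orientation, the sketch below computes the textbook log-scale Wald interval for a risk ratio from fully classified binomial counts. It ignores misclassification and double sampling entirely, so it is not any of the three intervals the abstract derives. The function name `wald_rr_ci` and the example counts are hypothetical.

```python
import math


def wald_rr_ci(x1, n1, x2, n2, z=1.959963984540054):
    """Log-scale Wald interval for the risk ratio p1 / p2.

    x1, x2 : event counts (must be positive)
    n1, n2 : group sizes
    z      : standard normal quantile (default is about 95% two-sided)
    Assumes error-free classification, unlike the setting in the abstract.
    """
    p1, p2 = x1 / n1, x2 / n2
    rr = p1 / p2
    # Delta-method standard error of log(rr).
    se_log = math.sqrt((1.0 - p1) / x1 + (1.0 - p2) / x2)
    lower = rr * math.exp(-z * se_log)
    upper = rr * math.exp(z * se_log)
    return rr, lower, upper


# Made-up counts for illustration only.
rr, lower, upper = wald_rr_ci(30, 100, 15, 100)
```

Because the interval is built on the log scale, it is symmetric around the point estimate in ratio terms: `lower * upper` equals `rr ** 2`.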
Abstract: This paper discusses the selection of the smoothing parameter necessary to implement a penalized regression using a nonconcave penalty function. The proposed method can be derived from a Bayesian viewpoint, and the resultant smoothing parameter is guaranteed to satisfy the sufficient conditions for the oracle properties of a one-step estimator. The results of simulation and application to some real data sets reveal that our proposal works efficiently, especially for discrete outputs.
Abstract: Using financial ratio data from 2006 and 2007, this study applies a three-fold cross validation scheme to compare the classification and prediction of bankrupt firms by robust logistic regression with the Bianco and Yohai (BY) estimator versus maximum likelihood (ML) logistic regression. With both the 2006 and 2007 data, BY robust logistic regression improves both the classification of bankrupt firms in the training set and the prediction of bankrupt firms in the testing set. In an out-of-sample test, the BY robust logistic regression correctly predicts bankruptcy for Lehman Brothers; however, the ML logistic regression never predicts bankruptcy for Lehman Brothers with either the 2006 or 2007 data. Our analysis indicates that if the BY robust logistic regression significantly changes the estimated regression coefficients from ML logistic regression, then the BY robust logistic regression method can significantly improve the classification and prediction of bankrupt firms. At worst, the BY robust logistic regression makes no changes in the estimated regression coefficients and has the same classification and prediction results as ML logistic regression. This is strong evidence that BY robust logistic regression should be used as a robustness check on ML logistic regression, and if a difference exists, then BY robust logistic regression should be used as the primary classifier.
Abstract: Identification of representative regimes of wave height and direction under different wind conditions is complicated by issues that relate to the specification of the joint distribution of variables that are defined on linear and circular supports and the occurrence of missing values. We take a latent-class approach and jointly model wave and wind data by a finite mixture of conditionally independent Gamma and von Mises distributions. Maximum-likelihood estimates of parameters are obtained by exploiting a suitable EM algorithm that allows for missing data. The proposed model is validated on hourly marine data obtained from a buoy and two tide gauges in the Adriatic Sea.
Abstract: In the face of global uncertainty and a growing reliance on third-party indices to gain a snapshot of a country’s operations, accurate decision making makes or breaks relationships in global trade. Against this background, we question the validity of traditional logistic regression using the maximum likelihood estimator (MLE) in classifying countries for doing business. This paper proposes that a weighted version of the Bianco and Yohai (BY) estimator is a superior and robust (outlier-resistant) tool in the hands of practitioners to gauge the correct antecedents of a country’s internal environment and decide whether or not to do business with that country. In addition, this robust process is effective in differentiating between “problem” countries and “safe” countries for doing business. An existing R program for the BY estimation technique by Croux and Haesbroeck has been modified for this purpose.
Abstract: In this paper we consider clinical trials with two treatments and a non-normally distributed response variable. In addition, we focus on applications which include only discrete covariates and their interactions. For such applications, the semi-parametric Area Under the ROC Curve (AUC) regression model proposed by Dodd and Pepe (2003) can be used. However, because a logistic regression procedure is used to obtain parameter estimates and a bootstrapping method is needed for computing parameter standard errors, their method may be cumbersome to implement. In this paper we propose to use a set of AUC estimates to obtain parameter estimates and combine DeLong’s method and the delta method for computing parameter standard errors. Our new method avoids the heavy computation associated with Dodd and Pepe’s method and hence is easy to implement. We conduct simulation studies to show that the two methods yield similar results. Finally, we illustrate our new method using data from urinary incontinence clinical trials.
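To make the building block concrete, the sketch below computes a single empirical AUC (the Mann-Whitney estimate) for two groups, together with its DeLong variance estimate. It covers only one AUC within one covariate stratum; the regression step and the delta method mentioned in the abstract are not shown. The function names and the example data are hypothetical.

```python
from statistics import mean, variance


def _psi(x, y):
    """Pairwise kernel: 1 if x beats y, 0.5 for a tie, 0 otherwise."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def auc_delong(group_a, group_b):
    """Empirical AUC = P(A > B) + 0.5 * P(A = B), with DeLong variance.

    Each group needs at least two observations so that the sample
    variances of the structural components are defined.
    """
    # Structural components: per-observation average of the kernel.
    v10 = [mean(_psi(x, y) for y in group_b) for x in group_a]
    v01 = [mean(_psi(x, y) for x in group_a) for y in group_b]
    auc = mean(v10)
    var = variance(v10) / len(group_a) + variance(v01) / len(group_b)
    return auc, var


# Made-up responses for illustration only.
auc, var = auc_delong([3, 4, 5], [1, 2, 3])
```

An AUC of 0.5 corresponds to no treatment difference, while values near 1 indicate that responses in the first group tend to exceed those in the second.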
Abstract: This paper reviews zero-inflated count models and applies them to modelling annual trends in the incidence of occupational allergic asthma, dermatitis and rhinitis in France. Based on data collected from 2001 to 2009, the study expresses the incidence rate ratios (IRR) as percentage changes in incidence and plots them as a function of year to obtain trends. The investigation reveals that the trend is decreasing for asthma and rhinitis, and increasing for dermatitis, and that there is a possible positive association between the three diseases.
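As a minimal sketch of the two ingredients named here, the code below gives the probability mass function of a zero-inflated Poisson model and the usual conversion of a log-link regression coefficient into a percentage change via the IRR. The abstract does not say which zero-inflated model or parameterization the paper uses, so the Poisson choice and the names `zip_pmf` and `irr_percent_change` are assumptions.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson model.

    lam : mean of the Poisson component
    pi  : probability of a structural (excess) zero
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    # Extra mass at zero comes from the structural-zero component.
    return pi * (k == 0) + (1.0 - pi) * poisson


def irr_percent_change(beta):
    """Percentage change in incidence implied by a log-link coefficient.

    IRR = exp(beta); the percentage change is 100 * (IRR - 1).
    """
    return 100.0 * (math.exp(beta) - 1.0)


# Made-up values for illustration only.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
change = irr_percent_change(math.log(0.9))
```

An IRR below 1 gives a negative percentage change (a decreasing trend) and an IRR above 1 gives a positive one.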