Abstract: Labor market surveys usually partition individuals into three states: employed, unemployed, and out of the labor force. In particular, the Argentine “Encuesta Permanente de Hogares (EPH)” follows a rotating scheme so that each selected household is interviewed four times within two years. Each time, the current labor state of individuals is recorded, together with extensive demographic information. We model those labor paths as consecutive observations from independent Markov chains, where transition matrices are related to covariates through a multivariate logistic link. Because the EPH is severely affected by attrition, a significant fraction of the surveyed paths contain just a single point. Instead of discarding those observations, we opt to base estimation on the full data by (i) assuming the Markov chains are stationary and (ii) incorporating the chronological time of the first interview as an additional covariate for each individual. This novel treatment represents a convenient approximation, which we illustrate with data from Argentina in the period 1995-2002 via maximum likelihood estimation. Several interesting labor market indexes, which are functionally related to the transition matrices, are also presented in the last portion of the paper and illustrated with real data.
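The role of the stationarity assumption can be illustrated with a minimal sketch. It assumes, as one reading of the abstract, that each row of the transition matrix is a multinomial logit in the covariates and that the first observed state is drawn from the stationary distribution, so a path with a single point still contributes to the likelihood. The function names, the coefficient layout `beta[i][j]`, and the numeric values are hypothetical and are not taken from the paper.

```python
import math

# States: 0 = employed, 1 = unemployed, 2 = out of the labor force.

def softmax(scores):
    """Turn a list of linear scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def transition_matrix(x, beta):
    """Row i holds P(next state = j | current state = i, covariates x).

    beta[i][j] is a hypothetical coefficient vector for the move i -> j.
    """
    rows = []
    for i in range(3):
        scores = [sum(b * v for b, v in zip(beta[i][j], x)) for j in range(3)]
        rows.append(softmax(scores))
    return rows

def stationary(P, iters=500):
    """Approximate the stationary distribution by repeated multiplication."""
    pi = [1.0 / 3.0] * 3
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
    return pi

def path_likelihood(path, P, pi):
    """Likelihood of an observed path; a one-point path contributes pi only."""
    like = pi[path[0]]
    for a, b in zip(path, path[1:]):
        like *= P[a][b]
    return like

# Hypothetical covariates (intercept, age in decades) and coefficients.
x = [1.0, 3.5]
beta = [
    [[2.0, 0.1], [0.0, 0.0], [0.5, -0.1]],
    [[0.8, 0.0], [0.0, 0.0], [0.3, 0.05]],
    [[0.2, -0.1], [0.0, 0.0], [1.5, 0.1]],
]
P = transition_matrix(x, beta)
pi = stationary(P)
```

Under these assumptions, `path_likelihood([0], P, pi)` equals `pi[0]`, which is how an individual interviewed only once would enter a likelihood of this form.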
In this article a new Bayesian regression model, called the Bayesian semi-parametric logistic regression model, is introduced. This model generalizes the semi-parametric logistic regression model (SLoRM) and improves its estimation process. The paper considers Bayesian and non-Bayesian estimation and inference for the parametric and semi-parametric logistic regression models, with application to credit scoring data under the squared error loss function. The paper describes in detail a new algorithm for estimating the SLoRM parameters using Bayes' theorem. Finally, the parametric logistic regression model (PLoRM), the SLoRM and the Bayesian SLoRM are compared using a real data set.
Abstract: When there is a rare disease in a population, it is inefficient to take a random sample to estimate a parameter. Instead one takes a random sample of all nuclear families with the disease by ascertaining at least one affected sibling (proband) of each family. In these studies, an estimate of the proportion of siblings with the disease will be inflated. For example, the question of whether a rare disease shows an autosomal recessive pattern of inheritance, where the Mendelian segregation ratios are of interest, has been investigated for several decades. How do we correct for this ascertainment bias? Methods, primarily based on maximum likelihood estimation, are available to correct for the ascertainment bias. We show that for ascertainment bias, although maximum likelihood estimation is optimal under asymptotic theory, it can perform badly. The problem is exacerbated in the situation where the proband probabilities are allowed to vary with the number of affected siblings. We use two data sets to illustrate the difficulties of the maximum likelihood estimation procedure, and we use a simulation study to assess the quality of the maximum likelihood estimators.
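The inflation described above can be made concrete with a small sketch. It assumes the simplest setting only: every sibship has the same size `s`, each sibling is affected independently with probability `p`, and every family with at least one affected sibling is ascertained (complete ascertainment). The varying proband probabilities studied in the paper are not modeled here, and the function names are hypothetical.

```python
def inflated_proportion(p, s):
    """Expected affected proportion among ascertained sibships of size s.

    Conditioning on at least one affected sibling truncates the binomial
    at zero, so the expected proportion is p / (1 - (1 - p)**s).
    """
    return p / (1.0 - (1.0 - p) ** s)

def corrected_p(observed, s, tol=1e-12):
    """Solve inflated_proportion(p, s) = observed for p by bisection.

    Assumes 1/s < observed < 1, the range the truncated model can produce.
    """
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if inflated_proportion(mid, s) < observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Autosomal recessive segregation ratio 1/4 in sibships of size 3.
naive = inflated_proportion(0.25, 3)
recovered = corrected_p(naive, 3)
```

With a segregation ratio of 1/4 and sibships of size 3, the naive proportion among ascertained families is well above 1/4, and inverting the truncated-binomial mean recovers the original value.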
Abstract: Anti-smoking media campaigns are an effective tobacco control strategy. Identifying which types of advertising messages are effective is important for making the best use of the limited funding available for such campaigns. In this paper, we propose a statistical modeling approach for systematically assessing the effectiveness of anti-smoking media campaigns based on ad recall rates and rating scores. This research is motivated by the need to evaluate youth responses to the Massachusetts Tobacco Control Program (MTCP) media campaign. Pattern-mixture GEE models are proposed to evaluate the impact of viewer and ad characteristics on ad recall rates and rating scores, controlling for missing values, confounding and correlations in the data. A key difficulty for pattern-mixture modeling is that there were too many distinct missing data patterns, which causes convergence problems for model fitting based on limited data. A heuristic argument based on collapsing missing data patterns is used to test the missing completely at random (MCAR) assumption in pattern-mixture GEE models. The proposed modeling approach and the recall-rating study design provide a complete system for identifying the most effective type of advertising messages.
Abstract: In this paper, we discuss classical and Bayes estimation procedures for estimating the unknown parameters as well as the reliability and hazard functions of the flexible Weibull distribution when the observed data are collected under a progressive Type-II censoring scheme. The performances of the maximum likelihood and Bayes estimators are compared in terms of their mean squared errors through a simulation study. For the computation of Bayes estimates, we propose the use of Lindley’s approximation and Markov Chain Monte Carlo (MCMC) techniques, since the posteriors of the parameters are not analytically tractable. Further, we derive the one- and two-sample posterior predictive densities of future samples and obtain the predictive bounds for future observations using MCMC techniques. To illustrate the discussed procedures, a set of real data is analysed.
Abstract: Female baboons, some with infants, were observed and counts made of interactions in which females interacted with the infants of other females (so-called infant-handling). Independent of these observations, each baboon is assigned a dominance rank of “low,” “medium,” or “high.” Researchers hypothesized that females tend to handle infants of females ranked below them. The data form an array with row-labels being infant labels and columns being female labels. Entry (i, j) counts total infant handlings of infant i by female j. Each count corresponds to one of 9 combinations of female by infant/mother ranks, which induces a 3-by-3 table of total interactions. We use a permutation test to support the research hypothesis, where ranks are permuted at random. We also discuss statistical properties of our method such as choice of test statistic, power, and stability of results to individual observations. We discover that the data support a nuanced view of baboon interaction, where higher-ranked females prefer to handle down the hierarchy, while lower-ranked females must balance the desire to accede to the desires of the high-ranked females while protecting their infants from the potential risks involved in such interactions.
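A permutation test of this kind can be sketched as follows. The test statistic used here (the total number of handlings in which the handler outranks the infant's mother) is one plausible choice and is an assumption, as are the toy counts, the rank coding 0 = low, 1 = medium, 2 = high, and all function names. None of these come from the study's data.

```python
import random

def down_handling(counts, mother_of, ranks):
    """Total handlings where the handling female outranks the infant's mother."""
    total = 0
    for i, row in enumerate(counts):
        mother = mother_of[i]
        for j, n in enumerate(row):
            if j != mother and ranks[j] > ranks[mother]:
                total += n
    return total

def permutation_p_value(counts, mother_of, ranks, n_perm=5000, seed=0):
    """One-sided p-value obtained by permuting ranks among the females."""
    rng = random.Random(seed)
    observed = down_handling(counts, mother_of, ranks)
    shuffled = list(ranks)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if down_handling(counts, mother_of, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Toy data: counts[i][j] = handlings of infant i by female j.
ranks = [2, 1, 0, 0]       # female 0 high, female 1 medium, females 2-3 low
mother_of = [1, 2, 3]      # infant i belongs to female mother_of[i]
counts = [
    [5, 0, 1, 0],
    [4, 3, 0, 1],
    [6, 2, 0, 0],
]
observed = down_handling(counts, mother_of, ranks)
p_value = permutation_p_value(counts, mother_of, ranks)
```

Permuting the rank labels while holding the count array fixed keeps each female's overall activity level intact, so only the association between rank and handling direction is tested.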
Abstract: A spatio-temporal statistical model for Chronic Wasting Disease is presented. The model has underpinnings from traditional epidemic models with differential equations and uses a Bayesian hierarchy to directly incorporate existing prevalence data. Spatial dynamics are modeled explicitly through a system of difference equations rather than through covariance. The posterior distribution gives evidence of a long term stable level of disease prevalence, and approximates the probability of the movement of the disease from one area to another. Predictions for the future of Chronic Wasting Disease in Colorado are given. The model is used to formulate efficient sampling schemes for future data collection.
Abstract: This paper uses a structural time series methodology to test the notion of interconnectedness between the UK and the US credit markets. The empirical tests utilise data on premiums for banking sector credit default swaps (CDS) and cover the recent period of financial turmoil. The methodology, based on the Kalman filter, is robust in the presence of limited convergence. The results show clear long-term steady-state convergence in CDS premiums between these two markets. This observation lends support to coordinated regulatory policy initiatives to deal with the crisis and offers suggestions for the sound operation of the international financial system.
Abstract: In the case-parents trio design for testing candidate-gene association, the distribution of the data under the null hypothesis of no association is completely known. Therefore, the exact null distribution of any test statistic can be simulated by the Monte Carlo method. In the literature, several robust tests have been proposed for testing association in the case-parents trio design when the genetic model is unknown, but all these tests are based on the asymptotic null distributions of the test statistics. In this article, we promote exact robust tests using Monte Carlo simulations, for two reasons: (i) the asymptotic tests are not accurate in terms of the probability of type I error when the sample size is small or moderate; (ii) asymptotic theory is not available for certain good candidate test statistics. We examined the validity of the asymptotic distributions of some of the test statistics studied in the literature and found that in certain cases the probability of type I error is greatly inflated in the asymptotic tests. We also propose new robust test statistics which are statistically more reasonable but for which no asymptotic theory is available. The powers of these robust statistics are compared with those of the existing statistics in the literature through a simulation study. It is found that these robust statistics are preferable to the others in terms of their efficiency and robustness.
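The idea of simulating the null distribution can be sketched with a deliberately simple stand-in. The example below uses the classical transmission disequilibrium (TDT) statistic, not the robust statistics proposed in the paper, and relies on the fact that under the null each heterozygous parent transmits either allele with probability 1/2. The function names and the counts are hypothetical, and at least one heterozygous parent is assumed.

```python
import random

def tdt_statistic(b, c):
    """TDT statistic: b and c count transmissions of each allele
    from heterozygous parents (assumes b + c > 0)."""
    return (b - c) ** 2 / (b + c)

def monte_carlo_p_value(b, c, n_sim=10000, seed=0):
    """Estimate the null p-value by simulating Mendelian transmissions."""
    rng = random.Random(seed)
    n = b + c
    observed = tdt_statistic(b, c)
    hits = 0
    for _ in range(n_sim):
        # Under the null, each transmission is a fair coin flip.
        sim_b = sum(rng.random() < 0.5 for _ in range(n))
        if tdt_statistic(sim_b, n - sim_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)

# Hypothetical counts: 15 transmissions of one allele, 5 of the other.
p_value = monte_carlo_p_value(15, 5, n_sim=2000)
```

Because the simulation only needs the statistic to be computable, the same loop applies to statistics with no known asymptotic distribution; only `tdt_statistic` would need to be swapped out.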