Abstract: This paper considers models of educational data where a value-added analysis is required. These models are multilevel in nature and contain endogenous regressors. Multivariate models are considered so as to simultaneously model results from different subject areas. Path models and factor models are considered as types of model that can be used to overcome the problem of endogeneity. Estimation methods available in MLwiN and EQS are used. The use of a factor model with EQS is shown to give estimates of the effects of teaching styles with smaller standard errors than any other method studied.
Abstract: Frailty models have become popular in survival analysis for dealing with situations where groups of observations are correlated. If the data comprise only exact or right-censored failure times, inference can be done either by integrating out the frailties directly or by using the EM algorithm. If there is both left- and right-censoring this is no longer the case. However, the MCMC method of Clayton (1991, Biometrics 47, 467-485) can be easily extended by imputation of the left-censored times. Several schemes for doing this are suggested and compared. Application of the methods is illustrated using data on the joint failures of patients with fibrodysplasia ossificans progressiva.
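The imputation step described above can be sketched in a few lines. This is an illustrative stand-in, not the paper's actual model: it assumes an exponential conditional failure-time distribution with a known rate, and draws a left-censored time T (known only to satisfy T <= c) from the distribution truncated to (0, c] by the inverse-CDF method, as one would inside a Gibbs/MCMC sweep.

```python
import math
import random

def impute_left_censored_exp(rate, c, rng):
    """Draw T ~ Exponential(rate) conditional on T <= c (inverse-CDF method)."""
    u = rng.random() * (1.0 - math.exp(-rate * c))  # U ~ Uniform(0, F(c))
    return -math.log(1.0 - u) / rate

rng = random.Random(0)
# Impute many left-censored times with rate 0.5 and censoring point c = 2
draws = [impute_left_censored_exp(rate=0.5, c=2.0, rng=rng) for _ in range(10_000)]
```

Every imputed time respects the censoring constraint by construction, which is the property the MCMC extension relies on.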
Time series modelling is a widely used technique in data science. Its main aim is to identify the data-generating process and estimate its parameters, which depend on all the observations. A few observations may misrepresent the data and unduly influence the parameter estimates; such observations are called outliers. The present study deals with the handling of outliers in the context of ARIMA time series and proposes an alternative approach for their replacement. Two ways of handling outliers are common: removing them from the data, or replacing them with nearby values. Removal does not work for autocorrelated data such as time series, and replacing an outlier with the immediately preceding or following value is also inappropriate because of the dependency structure. We therefore propose an alternative approach in which each outlier is replaced by its estimated value under the best-fitting model. The methodology is discussed in detail, and an empirical analysis of time series from the National Pension Scheme (NPS) is then carried out. Most of the series are modelled well; a few are not, owing to their non-stationary nature. After an outlier-free series is obtained, forecasting is also performed. Simulated realizations of the series are also analysed under the proposed methodology to give a more general view, with similar results.
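The replacement idea can be sketched in pure Python. A simple AR(1) fitted by least squares stands in for the "best model" chosen in the study, and both the model order and the 2.5-standard-deviation flagging threshold are illustrative assumptions: a point whose one-step residual exceeds the threshold is replaced by its fitted value rather than removed.

```python
import statistics

def replace_outliers_ar1(y, k=2.5):
    """Fit an AR(1) by least squares and replace flagged points with fitted values."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    phi = (sum((a - mx) * (b - mz) for a, b in zip(x, z))
           / sum((a - mx) ** 2 for a in x))
    c = mz - phi * mx
    fitted = [c + phi * a for a in x]
    resid = [b - f for b, f in zip(z, fitted)]
    sd = statistics.stdev(resid)
    cleaned = list(y)
    for i, r in enumerate(resid):
        if abs(r) > k * sd:
            cleaned[i + 1] = fitted[i]   # replace the outlier by the model estimate
    return cleaned

series = [10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 9.8, 10.1, 10.0, 9.9]
cleaned = replace_outliers_ar1(series)   # the spike at index 4 is replaced
```

Because the replacement comes from the fitted model, the dependency structure of the series is respected, which is the motivation the abstract gives for avoiding naive neighbouring-value substitution.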
Abstract: The aim of this paper is to describe the Bonus-Malus System (BMS) of Iran, which is a mandatory scheme based on Insurance Act number 56. We examine the current Iranian BMS using various criteria, such as elasticity and time of convergence to the steady state with respect to the claim frequency, as well as financial balance. We also find the closed form of the stationary distribution of the Iranian BMS, which plays a key role in the study of BMSs. Moreover, we compare the results with the German and Japanese BMSs. Finally, we give some hints that can be used to improve the performance of the current Iranian BMS.
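To illustrate the stationary-distribution computation that underlies such analyses, here is a toy three-class bonus-malus chain (the transition rules are invented for illustration, not the actual Iranian BMS). With Poisson claim frequency lam, a policyholder moves toward the best class after a claim-free year (probability exp(-lam)) and to the worst class otherwise; power iteration gives the steady-state class occupancy.

```python
import math

def stationary(transition, n_iter=500):
    """Steady-state distribution of a finite Markov chain by power iteration."""
    k = len(transition)
    pi = [1.0 / k] * k
    for _ in range(n_iter):
        pi = [sum(pi[i] * transition[i][j] for i in range(k)) for j in range(k)]
    return pi

lam = 0.1
p0 = math.exp(-lam)            # probability of a claim-free year
P = [
    [p0, 0.0, 1 - p0],         # class 0 (best): stay on no claim, else to worst
    [p0, 0.0, 1 - p0],         # class 1: move down on no claim, else to worst
    [0.0, p0, 1 - p0],         # class 2 (worst): move down on no claim, else stay
]
pi = stationary(P)             # long-run fraction of policyholders in each class
```

Criteria such as elasticity and financial balance are then functionals of `pi` and the class premiums, which is why the closed-form stationary distribution is central to the paper.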
Abstract: Labor market surveys usually partition individuals into three states: employed, unemployed, and out of the labor force. In particular, the Argentine “Encuesta Permanente de Hogares (EPH)” follows a rotating scheme so that each selected household is interviewed four times within two years. Each time, the current labor state of individuals is recorded, together with extensive demographic information. We model those labor paths as consecutive observations from independent Markov chains, where transition matrices are related to covariates through a multivariate logistic link. Because the EPH is severely affected by attrition, a significant fraction of the surveyed paths contain just a single point. Instead of discarding those observations, we opt to base estimation on the full data by (i) assuming the Markov chains are stationary and (ii) incorporating the chronological time of the first interview as an additional covariate for each individual. This novel treatment represents a convenient approximation, which we illustrate with data from Argentina for the period 1995-2002 via maximum likelihood estimation. Several interesting labor market indexes, which are functionally related to the transition matrices, are also presented in the last portion of the paper and illustrated with real data.
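The multivariate logistic link can be sketched as follows: each row of the three-state transition matrix is a softmax of covariate effects, with the current state as the reference category. The coefficient values and the covariate (an intercept and age/10) are made up for illustration and are not the paper's estimates.

```python
import math

STATES = ["employed", "unemployed", "out_of_lf"]

def transition_row(beta_row, x):
    """beta_row: one coefficient vector per destination state (None = reference)."""
    scores = [0.0 if b is None else sum(bi * xi for bi, xi in zip(b, x))
              for b in beta_row]
    m = max(scores)
    exp_s = [math.exp(s - m) for s in scores]   # numerically stable softmax
    total = sum(exp_s)
    return [e / total for e in exp_s]

# Hypothetical coefficients for the "employed" row: (intercept, age/10)
beta_employed = [None,                # employed -> employed (reference)
                 [-2.0, -0.10],       # employed -> unemployed
                 [-3.0, 0.05]]        # employed -> out of the labor force
row = transition_row(beta_employed, x=[1.0, 3.5])   # individual aged 35
```

Each row is a proper probability distribution by construction, so covariates (including the chronological time of the first interview, as in the paper) enter the transition matrix without violating its stochastic structure.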
In this article a new Bayesian regression model, called the Bayesian semi-parametric logistic regression model, is introduced. This model generalizes the semi-parametric logistic regression model (SLoRM) and improves its estimation process. The paper considers Bayesian and non-Bayesian estimation and inference for the parametric and semi-parametric logistic regression models, with an application to credit scoring data under the squared error loss function. The paper introduces in detail a new algorithm for estimating the SLoRM parameters using Bayes' theorem. Finally, the parametric logistic regression model (PLoRM), the SLoRM and the Bayesian SLoRM are applied and compared using a real data set.
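As a generic illustration of the Bayesian machinery involved (not the SLoRM algorithm itself), the following sketch fits a one-covariate parametric logistic regression by random-walk Metropolis with a vague normal prior; the data, prior scale, and step size are all illustrative choices.

```python
import math
import random

def log_post(beta, data, prior_sd=10.0):
    """Log posterior: N(0, prior_sd^2) prior plus Bernoulli-logit likelihood."""
    lp = -0.5 * (beta / prior_sd) ** 2
    for x, y in data:
        eta = beta * x
        lp += y * eta - math.log(1.0 + math.exp(eta))
    return lp

def metropolis(data, n=5000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the single coefficient beta."""
    rng = random.Random(seed)
    beta, lp = 0.0, log_post(0.0, data)
    draws = []
    for _ in range(n):
        cand = beta + rng.gauss(0.0, step)
        lp_cand = log_post(cand, data)
        if math.log(rng.random()) < lp_cand - lp:
            beta, lp = cand, lp_cand
        draws.append(beta)
    return draws

# Toy data roughly consistent with beta = 1: y tends to 1 when x > 0
data = [(-2, 0), (-1, 0), (-1, 1), (0, 0), (0, 1),
        (1, 1), (1, 0), (2, 1), (2, 1), (3, 1)]
draws = metropolis(data)
post_mean = sum(draws[1000:]) / len(draws[1000:])   # discard burn-in
```

The semi-parametric extension replaces part of the linear predictor with a nonparametric component, but the accept/reject structure of the sampler stays the same.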
Abstract: When a disease is rare in a population, it is inefficient to take a random sample to estimate a parameter. Instead, one takes a random sample of nuclear families with the disease by ascertaining at least one affected sibling (proband) in each family. In such studies, an estimate of the proportion of siblings with the disease will be inflated. For example, studies of whether a rare disease shows an autosomal recessive pattern of inheritance, where the Mendelian segregation ratios are of interest, have been investigated for several decades. How do we correct for this ascertainment bias? Methods, primarily based on maximum likelihood estimation, are available to correct for it. We show that, although maximum likelihood estimation is optimal under asymptotic theory, it can perform badly for ascertainment bias. The problem is exacerbated when the proband probabilities are allowed to vary with the number of affected siblings. We use two data sets to illustrate the difficulties of the maximum likelihood estimation procedure, and we use a simulation study to assess the quality of the maximum likelihood estimators.
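The bias and its textbook correction can be shown in a minimal form. Families of s siblings enter the sample only if at least one sibling is affected, so the naive affected proportion overestimates the segregation ratio p. Under complete ascertainment the corrected likelihood is the zero-truncated binomial; the sketch below maximizes it on a grid (grid search is an illustrative shortcut, not the paper's method, and the family counts are simulated).

```python
import math

def truncated_loglik(p, families):
    """families: list of (s, k) = (sibship size, number affected), k >= 1."""
    ll = 0.0
    for s, k in families:
        ll += (math.log(math.comb(s, k)) + k * math.log(p)
               + (s - k) * math.log(1 - p)
               - math.log(1 - (1 - p) ** s))   # condition on k >= 1
    return ll

def mle_p(families, grid=2000):
    ps = [(i + 1) / (grid + 2) for i in range(grid)]
    return max(ps, key=lambda p: truncated_loglik(p, families))

# Simulated ascertained sibships of size 4, roughly matching true p = 0.25
families = [(4, 1)] * 53 + [(4, 2)] * 21 + [(4, 3)] * 4 + [(4, 4)] * 1
naive = sum(k for _, k in families) / sum(s for s, _ in families)  # inflated
p_hat = mle_p(families)                                            # corrected
```

The naive estimate sits well above the true ratio because unaffected-only sibships are invisible; the truncated likelihood removes exactly that distortion.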
Abstract: Anti-smoking media campaigns are an effective tobacco control strategy. Identifying which types of advertising messages are effective is important for maximizing the use of the limited funding available for such campaigns. In this paper, we propose a statistical modeling approach for systematically assessing the effectiveness of anti-smoking media campaigns based on ad recall rates and rating scores. This research is motivated by the need to evaluate youth responses to the Massachusetts Tobacco Control Program (MTCP) media campaign. Pattern-mixture GEE models are proposed to evaluate the impact of viewer and ad characteristics on ad recall rates and rating scores, controlling for missing values, confounding and correlations in the data. A key difficulty for pattern-mixture modeling is that there were too many distinct missing-data patterns, which cause convergence problems when fitting models to limited data. A heuristic argument based on collapsing missing-data patterns is used to test the missing completely at random (MCAR) assumption in pattern-mixture GEE models. The proposed modeling approach and the recall-rating study design provide a complete system for identifying the most effective types of advertising messages.
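The pattern-collapsing idea can be sketched as follows: enumerate the missingness pattern of each respondent (which of the repeated measurements are observed) and merge patterns with too few respondents into a coarser group. Here the coarser group is simply the count of observed waves; this is an invented illustrative rule, not the collapsing used for the MTCP data.

```python
from collections import Counter

def collapse_patterns(rows, min_count=5):
    """rows: per-respondent tuples of measurements, with None for missing."""
    patterns = [tuple(v is not None for v in r) for r in rows]
    counts = Counter(patterns)
    collapsed = []
    for p in patterns:
        if counts[p] >= min_count:
            collapsed.append(p)                     # common pattern: keep as-is
        else:
            collapsed.append(("observed", sum(p)))  # rare: keep only the count
    return collapsed

# 8 complete cases, 6 with wave 2 missing, and 3 rare patterns
rows = ([(1, 2, 3)] * 8 + [(1, None, 3)] * 6
        + [(1, None, None)] * 2 + [(None, 2, None)])
groups = collapse_patterns(rows)
```

Fewer, larger pattern groups leave enough data per stratum for the pattern-mixture GEE fit to converge, which is the difficulty the abstract describes.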
Abstract: In this paper, we discuss classical and Bayesian estimation procedures for the unknown parameters, as well as the reliability and hazard functions, of the flexible Weibull distribution when the observed data are collected under a progressive Type-II censoring scheme. The performances of the maximum likelihood and Bayes estimators are compared in terms of their mean squared errors through a simulation study. For the computation of Bayes estimates, we propose the use of Lindley's approximation and Markov chain Monte Carlo (MCMC) techniques, since the posteriors of the parameters are not analytically tractable. Further, we derive the one- and two-sample posterior predictive densities of future samples and obtain predictive bounds for future observations using MCMC techniques. To illustrate the discussed procedures, a set of real data is analysed.
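For reference, the reliability and hazard functions that the abstract says are estimated follow directly from the distribution's CDF. Assuming the two-parameter flexible Weibull form F(t) = 1 - exp(-exp(a*t - b/t)) (the standard form of this distribution; parameter values below are illustrative):

```python
import math

def reliability(t, a, b):
    """Survival function R(t) = exp(-exp(a*t - b/t)) of the flexible Weibull."""
    return math.exp(-math.exp(a * t - b / t))

def hazard(t, a, b):
    """Hazard h(t) = (a + b/t^2) * exp(a*t - b/t), the failure rate at t."""
    return (a + b / t ** 2) * math.exp(a * t - b / t)

R = reliability(1.0, a=0.5, b=0.5)   # survival probability at t = 1
h = hazard(1.0, a=0.5, b=0.5)
```

Plugging parameter estimates (maximum likelihood or Bayes) into these expressions yields the corresponding reliability and hazard estimates compared in the simulation study.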