Abstract: We develop methods for applications where random change points are known a priori to be present and the interest lies in estimating them and investigating the risk factors that influence them. A simple least-squares method that estimates each individual’s change point from that individual’s own observations is proposed first. An easy-to-compute empirical Bayes type shrinkage is then proposed to pool information from the separately estimated change points. A method to improve the empirical Bayes estimates is also developed. Simulations are conducted to compare the least-squares estimates and the Bayes shrinkage estimates. The proposed methods are applied to the Berkeley Growth Study data to estimate the transition age of pubertal height growth.
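The two-stage idea in this abstract (per-individual least squares, then shrinkage toward a pooled value) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's estimator: the broken-stick mean model, the grid search, the normal-normal shrinkage weights, and the names `ls_change_point` and `shrink` are all hypothetical, since the abstract does not specify them.

```python
import numpy as np

def ls_change_point(t, y, grid):
    """Least-squares change point for one individual.

    Assumed model (not taken from the paper):
        y = b0 + b1*t + b2*max(t - tau, 0) + error
    For each candidate tau on the grid, fit the linear part by ordinary
    least squares and keep the tau with the smallest residual sum of squares.
    """
    best_tau, best_rss = None, np.inf
    for tau in grid:
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - tau, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if rss < best_rss:
            best_tau, best_rss = tau, rss
    return best_tau

def shrink(tau_hat, se2):
    """Generic normal-normal empirical Bayes shrinkage (an assumption).

    tau_hat: separately estimated change points, one per individual.
    se2:     their sampling variances, assumed known and positive.
    """
    tau_hat = np.asarray(tau_hat, dtype=float)
    se2 = np.asarray(se2, dtype=float)
    mu = tau_hat.mean()
    # Method-of-moments estimate of the between-individual variance,
    # floored at zero.
    between = max(tau_hat.var(ddof=1) - se2.mean(), 0.0)
    weight = between / (between + se2)
    # Noisy estimates (large se2) are pulled more strongly toward mu.
    return mu + weight * (tau_hat - mu)

# Noise-free toy data with a true change point at t = 5.
t = np.linspace(0.0, 10.0, 41)
y = 1.0 + 0.5 * t + 2.0 * np.maximum(t - 5.0, 0.0)
tau_hat = ls_change_point(t, y, np.arange(2.0, 9.0, 0.5))
```

In this sketch the shrinkage weight depends only on the ratio of between-individual variance to sampling variance, so individuals with few or noisy observations borrow more from the group.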
Abstract: Mixed effects models are often used for estimating fixed effects and variance components in continuous longitudinal outcomes. An EM-based estimation approach for mixed effects models when the outcomes are truncated was proposed by Hughes (1999). We consider the situation in which the longitudinal outcomes are subject to non-ignorable missingness in addition to truncation. A shared random effect parameter model is presented in which the missing data mechanism depends on the random effects used to model the longitudinal outcomes. Data from the Indianapolis-Ibadan dementia project are used to illustrate the proposed approach.
Abstract: Labor market surveys usually partition individuals into three states: employed, unemployed, and out of the labor force. In particular, the Argentine “Encuesta Permanente de Hogares (EPH)” follows a rotating scheme so that each selected household is interviewed four times within two years. Each time, the current labor state of individuals is recorded, together with extensive demographic information. We model those labor paths as consecutive observations from independent Markov chains, where the transition matrices are related to covariates through a multivariate logistic link. Because the EPH is severely affected by attrition, a significant fraction of the surveyed paths contain only a single point. Instead of discarding those observations, we opt to base estimation on the full data by (i) assuming the Markov chains are stationary and (ii) incorporating the chronological time of the first interview as an additional covariate for each individual. This novel treatment represents a convenient approximation, which we illustrate with data from Argentina in the period 1995-2002 via maximum likelihood estimation. Several interesting labor market indexes, which are functionally related to the transition matrices, are also presented in the last portion of the paper and illustrated with real data.
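The role of the stationarity assumption can be sketched in code: if the chain is stationary, the first observed state contributes its stationary probability to the likelihood, so a path with a single point still carries information. The sketch below is an illustration under assumptions, not the paper's implementation: the row-wise multinomial logit parameterisation, the coefficient array shape, and the names `transition_matrix`, `stationary`, and `path_loglik` are hypothetical.

```python
import numpy as np

def transition_matrix(x, coef):
    """Row-wise multinomial logit link (hypothetical parameterisation).

    x:    covariate vector of length p (it could include the time of the
          first interview, as the abstract describes).
    coef: array of shape (3, 3, p); coef[i, j] scores the move i -> j.
    """
    scores = coef @ x                                  # shape (3, 3)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

def stationary(P):
    """Solve pi P = pi subject to sum(pi) = 1 by least squares."""
    k = P.shape[0]
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    rhs = np.append(np.zeros(k), 1.0)
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

def path_loglik(path, P):
    """Log-likelihood of one labor path under a stationary chain.

    The first state contributes log pi[state]; each later step contributes
    log P[previous, current]. A one-point path uses only the first term.
    """
    pi = stationary(P)
    ll = np.log(pi[path[0]])
    for src, dst in zip(path[:-1], path[1:]):
        ll += np.log(P[src, dst])
    return float(ll)

# Toy matrix over (employed, unemployed, out of labor force); values invented.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
single_point = path_loglik([1], P)
two_points = path_loglik([0, 1], P)
```

Summing `path_loglik` over individuals, with each individual's matrix built by `transition_matrix` from that individual's covariates, would give an objective that could be maximized over `coef`.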
Abstract: An analysis of air quality data is provided for the municipal area of Taranto, which is characterized by high environmental risk due to the massive presence of industrial sites whose activities have a high environmental impact. The present study focuses on particulate matter as measured by PM10 concentrations. Preliminary analysis involved addressing several data problems, mainly: (i) imputation techniques were considered to cope with the large amount of missing data, due both to different working periods for groups of monitoring stations and to occasional malfunction of PM10 sensors; (ii) because different validation techniques are used for each of the three monitoring networks, a calibration procedure was devised to allow for data comparability. Missing data imputation and calibration were addressed by three alternative procedures that share a leave-one-out type mechanism and are based on ad hoc exploratory tools and on the recursive Bayesian estimation and prediction of spatial linear mixed effects models. The three procedures are introduced by their motivating issues and compared in terms of performance.
Abstract: Graphical procedures can be useful for illustrating and evaluating the process of inverse regression. We first review some simple and well-known graphical approaches for univariate linear and nonlinear models. We then propose a new graphical tool applicable to situations where the response is bivariate and repeated measures data are available. The proposed method is illustrated with an example of the age determination of tern chicks using measurements on body weight and wing length.
Abstract: State lotteries employ sales projections to determine appropriate advertised jackpot levels for some of their games. This paper focuses on prediction of sales for the Lotto Texas game of the Texas Lottery. A novel prediction method is developed in this setting that utilizes functional data analysis concepts in conjunction with a Bayesian paradigm to produce predictions and associated precision assessments.
Abstract: Current trends in Northern Hemisphere and Central England temperatures are estimated using a variety of statistical signal extraction and filtering techniques and their extrapolations are compared with the predictions from coupled atmospheric-ocean general circulation models. Earlier warming trend epochs are also analysed and compared with the current warming trend, suggesting that the long-run patterns of temperature trends should also be considered alongside the current emphasis on global warming.
Abstract: The partial least squares (PLS) method was designed to handle two problems commonly encountered in data from most applied sciences, including neuroimaging: 1) collinearity among explanatory variables (X) or among dependent variables (Y); 2) a small number of observations with a large number of explanatory variables. The idea behind the method is to explain as much as possible of the covariance between the X and Y blocks of variables with a small number of uncorrelated variables. In imaging applications, PLS has been used to identify task dependent changes in activity, changes in the relations between brain and behavior, and to examine functional connectivity of one or more brain regions. The aim of this paper is to introduce PLS and apply it to electroencephalography (EEG) data to identify stimulation dependent changes in EEG activity.
Abstract: A statistical evaluation of the Baltimore County water well database is performed to gain insight into the sustainability of domestic supply wells in crystalline bedrock aquifers over the last 15 years. The variables considered as potentially related to well yield include well construction, geology, well depth, and static water level. A variety of statistical methods are utilized to assess correlation and significance from a database of approximately 8,500 wells, and a logistic regression model is developed to predict the probability of well failure by geology type. Results of a two-way analysis of variance indicate that the average well depth and yield are statistically different among the established geology groups, and between failed and non-failed wells. The static water level is shown to be statistically different among the geology groups but not between failed and non-failed wells. The logistic regression model indicates that well yield is the most influential variable for predicting well failure. Static water level and well depth were not found to be significant in predicting well failure.
Abstract: Controlled experiments give researchers a statistical tool for determining the yield from subjecting an experimental unit to various treatments. We discuss a replicated block design applied to the experimental unit yeast. We subjected the yeast to six treatments. The purpose of the experiment is to extract a compound to be used in the manufacturing industry. We considered an ANOVA and a MANOVA model to analyze the data. The rationale for selecting one model over the other is discussed. Results and recommendations on which treatments to use when processing the yeast are also presented.