Abstract: We consider the Autoregressive Conditional Marked Duration (ACMD) model and apply it to 16 stocks traded on the Hong Kong Stock Exchange (SEHK). By examining the orderings of appropriate sets of model parameters, market microstructure phenomena can be explained. To substantiate these conclusions, a likelihood ratio test is used to test the significance of the parameter orderings of the ACMD model. While some of our results resolve a few controversial market microstructure hypotheses and echo existing empirical evidence, we also discover some interesting market microstructure phenomena that may be characteristic of the SEHK.
Abstract: Believe the Positive (BP) and Believe the Negative (BN) rules for combining two continuous diagnostic tests are compared with procedures based on the likelihood ratio and on a linear combination of the two tests. The sensitivity-specificity relationship for BP/BN is illustrated through a graphical presentation of a "ROC surface", which leads to a natural approach for choosing between BP and BN. Under a bivariate normal model, it is shown that the discriminating power of this approach is higher when the correlation between the two tests has different signs for the non-diseased and diseased populations, given that the locations and variances of the two distributions are fixed. The idea is illustrated through an example.
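A minimal sketch of the two combination rules under a bivariate normal model, not the paper's code: BP calls a subject positive if either component test exceeds its threshold, BN only if both do. The thresholds c1, c2 and all distribution parameters below are illustrative assumptions; the correlation is deliberately given different signs in the two populations, as in the abstract's setting.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def both_at_most(c1, c2, mu, sd, rho):
    """P(T1 <= c1, T2 <= c2) under a bivariate normal."""
    cov = [[sd[0] ** 2, rho * sd[0] * sd[1]],
           [rho * sd[0] * sd[1], sd[1] ** 2]]
    return multivariate_normal(mean=mu, cov=cov).cdf([c1, c2])

def both_exceed(c1, c2, mu, sd, rho):
    """P(T1 > c1, T2 > c2) via inclusion-exclusion on the joint CDF."""
    return (1.0
            - norm.cdf(c1, mu[0], sd[0])
            - norm.cdf(c2, mu[1], sd[1])
            + both_at_most(c1, c2, mu, sd, rho))

# Illustrative populations: correlation differs in sign between the two groups.
nondiseased = dict(mu=[0.0, 0.0], sd=[1.0, 1.0], rho=0.4)
diseased    = dict(mu=[1.5, 1.2], sd=[1.0, 1.2], rho=-0.4)
c1, c2 = 0.8, 0.7   # test-specific positivity thresholds (assumed)

# BP rule: positive if either component test is positive.
sens_bp = 1.0 - both_at_most(c1, c2, **diseased)
spec_bp = both_at_most(c1, c2, **nondiseased)

# BN rule: positive only if both component tests are positive.
sens_bn = both_exceed(c1, c2, **diseased)
spec_bn = 1.0 - both_exceed(c1, c2, **nondiseased)

print(f"BP: sensitivity={sens_bp:.3f}, specificity={spec_bp:.3f}")
print(f"BN: sensitivity={sens_bn:.3f}, specificity={spec_bn:.3f}")
```

Sweeping (c1, c2) over a grid and plotting sensitivity against specificity for both rules traces out the "ROC surface" comparison described in the abstract.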
Abstract: Household data are frequently used in estimating vaccine efficacy because they provide information about every individual's exposure to vaccinated and unvaccinated infected household members. This information is essential for reliable estimation of vaccine efficacy for infectiousness (VE_I), in addition to estimating vaccine efficacy for susceptibility (VE_S). However, accurate infection outcome data are not always available on each person due to high cost or lack of feasible methods to collect this information. Lack of reliable data on true infection status may result in biased or inefficient estimates of vaccine efficacy. In this paper, a semiparametric method that uses surrogate outcome data and a validation sample is introduced for estimation of VE_S and VE_I from a sample of households. The surrogate outcome data are usually based on illness symptoms. We report the results of simulations conducted to examine the performance of the estimates, compare the proposed semiparametric method with maximum likelihood methods that use either the validation data only or the surrogate data only, and address study design issues. The new method shows improved precision compared to a method based on the validation sample only and smaller bias compared to a method using surrogate outcome data only. In addition, the use of household data is shown to greatly reduce the attenuation in the estimate of VE_S caused by misclassification of the outcome, as compared to the use of a random sample of unrelated individuals.
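For orientation, a tiny illustration (values invented, not from the paper) of how the two efficacy measures are commonly expressed through household secondary attack rates (SARs), stratified by the vaccine status of the infective and of the exposed susceptible:

```python
# sar[(i, j)]: probability that an infective with vaccine status i infects a
# susceptible household member with vaccine status j (0 = unvaccinated,
# 1 = vaccinated).  Numbers are made up for illustration only.
sar = {(0, 0): 0.40, (0, 1): 0.16, (1, 0): 0.20, (1, 1): 0.08}

# Vaccine efficacy for susceptibility: reduction in the exposed person's risk
# of infection when the infector is unvaccinated.
ve_s = 1 - sar[(0, 1)] / sar[(0, 0)]          # 1 - 0.16/0.40 = 0.60

# Vaccine efficacy for infectiousness: reduction in transmission from a
# vaccinated infector to an unvaccinated contact.
ve_i = 1 - sar[(1, 0)] / sar[(0, 0)]          # 1 - 0.20/0.40 = 0.50

print(f"VE_S = {ve_s:.2f}, VE_I = {ve_i:.2f}")
```

The paper's semiparametric method targets these same quantities when the true infection status is only known in a validation subsample and a symptom-based surrogate is available for everyone else.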
Abstract: The generalized gamma model has been used in several applied areas such as engineering, economics and survival analysis. We provide an extension of this model, called the transmuted generalized gamma distribution, which includes some lifetime distributions as special cases. The proposed density function can be represented as a mixture of generalized gamma densities. Some mathematical properties of the new model, such as the moments, generating function, mean deviations and Bonferroni and Lorenz curves, are provided. We estimate the model parameters using maximum likelihood. We show, by means of a real data set, that the proposed distribution can be a competitive model in lifetime applications.
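A minimal sketch of the construction, assuming the standard quadratic rank transmutation map G(x) = (1 + λ)F(x) - λF(x)^2 with -1 ≤ λ ≤ 1, applied to the generalized gamma baseline in scipy's `gengamma` parameterisation (the paper's own parameterisation may differ):

```python
import numpy as np
from scipy.stats import gengamma
from scipy.integrate import quad

def tgg_pdf(x, a, c, lam, scale=1.0):
    """Transmuted generalized gamma pdf: g(x) = f(x) * (1 + lam - 2*lam*F(x))."""
    f = gengamma.pdf(x, a, c, scale=scale)
    F = gengamma.cdf(x, a, c, scale=scale)
    return f * (1.0 + lam - 2.0 * lam * F)

def tgg_cdf(x, a, c, lam, scale=1.0):
    """Transmuted generalized gamma cdf via the quadratic rank transmutation map."""
    F = gengamma.cdf(x, a, c, scale=scale)
    return (1.0 + lam) * F - lam * F ** 2

# Sanity check with arbitrary illustrative parameters: the pdf integrates to 1.
a, c, lam = 2.0, 1.5, 0.6
total, _ = quad(tgg_pdf, 0, np.inf, args=(a, c, lam))
print(f"integral of pdf = {total:.6f}")   # ~ 1.000000
```

Writing g(x) = (1 - λ)f(x) + λ·[2f(x)(1 - F(x))] shows the mixture structure the abstract alludes to: for λ in [0, 1] the density mixes the baseline generalized gamma density with the density of the minimum of two independent generalized gamma variables.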
This study applied partial least squares (PLS) path modeling to quantify and identify the determinants of job seekers' acceptance and use of employment websites (EWs), using an aggregate model that combines task-technology fit (TTF) with the consumer acceptance and use of information technology model (UTAUT2). We propose that the most crucial constructs explaining EW adoption are habit, behavioral intention, performance expectancy, and facilitating conditions. This study verified that a job seeker's habits were a major predictor of intention to use and usage of EWs, which involve web-based technology and occasional use. Thus, when job seekers perceive that the technology fits their task, they recognize the value of using the technology and use it habitually.
Overdispersion is a common phenomenon in Poisson modelling. The generalized Poisson (GP) distribution accommodates both overdispersion and underdispersion in count data. In this paper, we briefly overview different overdispersed and zero-inflated regression models. To study the impact of fitting an inaccurate model to data simulated from some other model, we simulate data from the zero-inflated generalized Poisson (ZIGP) distribution and fit Poisson, GP, zero-inflated Poisson (ZIP), ZIGP and zero-inflated negative binomial (ZINB) models. We compare the performance of the estimates from the Poisson, GP, ZIP, ZIGP and ZINB models through mean square error, bias and standard error when the samples are generated from the ZIGP distribution. We propose estimators of the parameters of the ZIGP distribution based on the first two sample moments and the proportion of zeros, referred to as the MOZE estimator, and compare its performance with the maximum likelihood estimator (MLE) through a simulation study. It is observed that the MOZE estimators are almost as efficient as, or even more efficient than, the MLEs of the parameters of the ZIGP distribution.
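A hedged sketch of a moment-and-zero-proportion ("MOZE"-style) estimator, matching the sample mean, second moment, and observed fraction of zeros; the ZIGP parameterisation assumed here (structural zero with probability phi, otherwise GP(theta, alpha) with mean theta/(1-alpha) and variance theta/(1-alpha)^3) and the solver choice are illustrative assumptions, not the paper's exact estimating equations.

```python
import numpy as np
from scipy.optimize import fsolve

def moze_equations(params, m1, m2, p0):
    """Three moment/zero-matching equations for the ZIGP(phi, theta, alpha) model."""
    phi, theta, alpha = params
    mu  = theta / (1.0 - alpha)                     # GP mean
    var = theta / (1.0 - alpha) ** 3                # GP variance
    return [
        (1.0 - phi) * mu - m1,                      # match first sample moment
        (1.0 - phi) * (var + mu ** 2) - m2,         # match second sample moment
        phi + (1.0 - phi) * np.exp(-theta) - p0,    # match proportion of zeros
    ]

def moze_estimate(y, start=(0.2, 1.0, 0.1)):
    """Solve the three equations numerically from a sample of counts y."""
    y = np.asarray(y, dtype=float)
    m1, m2, p0 = y.mean(), np.mean(y ** 2), np.mean(y == 0)
    return fsolve(moze_equations, start, args=(m1, m2, p0))

# Usage on any ZIGP-like sample of counts (sampler omitted for brevity):
# phi_hat, theta_hat, alpha_hat = moze_estimate(y_sample)
```

Repeating this over simulated ZIGP samples and comparing against the MLE gives the mean square error, bias and standard error comparison described in the abstract.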
Abstract: As a useful alternative to the Cox proportional hazards model, the linear regression survival model assumes a linear relationship between the covariates and a known monotone transformation, for example the logarithm, of an event time of interest. In this article, we study the linear regression survival model with right censored survival data, when high-dimensional microarray measurements are present. Such data may arise in studies investigating the statistical influence of molecular features on survival risk. We propose using the principal component regression (PCR) technique for model reduction based on the weighted least squares Stute estimator. Compared with other model reduction techniques, the PCR approach is relatively insensitive to the number of covariates and hence suitable for high-dimensional microarray data. Component selection based on the nonparametric bootstrap, and model evaluation using the time-dependent ROC (receiver operating characteristic) technique, are investigated. We demonstrate the proposed approach with datasets from two microarray gene expression profiling studies of lymphoma cancers.
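A rough sketch of the idea, not the authors' implementation: Kaplan-Meier jump (Stute) weights are attached to the observed log times, the covariates are reduced to a few principal components, and a weighted least squares fit is run on those components. The number of components K and all names are illustrative assumptions.

```python
import numpy as np

def stute_weights(time, event):
    """Kaplan-Meier jump weights of Stute's weighted least squares estimator.
    time: observed times; event: 1 = event observed, 0 = censored."""
    n = len(time)
    order = np.argsort(time, kind="mergesort")
    d = np.asarray(event, dtype=float)[order]
    w = np.zeros(n)
    prod = 1.0
    for i in range(n):                       # i is 0-based; rank = i + 1
        w[i] = d[i] / (n - i) * prod
        prod *= ((n - i - 1) / (n - i)) ** d[i]
    weights = np.zeros(n)
    weights[order] = w                       # map back to the original order
    return weights

def pcr_stute(X, time, event, K=3):
    """Weighted PCR of log(time) on the first K principal components of X."""
    w = stute_weights(time, event)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # principal directions
    Z = Xc @ Vt[:K].T                                   # component scores, n x K
    Z1 = np.column_stack([np.ones(len(Z)), Z])          # add intercept
    ZtW = Z1.T * w                                      # weight each observation
    beta = np.linalg.solve(ZtW @ Z1, ZtW @ np.log(time))
    return beta, Vt[:K]
```

In the censored setting only the uncensored observations receive positive weight, which is why the fit can be run directly on the observed (possibly censored) log times.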
Abstract: Accurately understanding the distribution of sediment measurements within large water bodies such as Lake Michigan is critical for modeling and understanding of carbon, nitrogen, silica, and phosphorus dynamics. Several water quality models have been formulated and applied to the Great Lakes to investigate the fate and transport of nutrients and other constituents, as well as plankton dynamics. This paper summarizes the development of spatial statistical tools to study and assess the spatial trends of the sediment data sets collected from Lake Michigan as part of the Lake Michigan Mass Balance Study. Several new spatial measurements were developed to quantify the spatial variation and continuity of the sediment data sets of interest. The application of the newly designed spatial measurements to the sediment data, in conjunction with descriptive statistics, clearly reveals the existence of an intrinsic structure of strata, which is hypothesized based on linear wave theory. Furthermore, a new concept of strata consisting of two components defined based on depth is proposed and justified. The findings presented in this paper may impact future studies of sediment within Lake Michigan and the other Great Lakes as well.
Abstract: Trials in which clusters of subjects, rather than individuals, are randomized to interventions are commonly called cluster randomized trials (CRTs). For comparison of binary outcomes in a CRT, although there are a few published formulations for sample size computation, the most commonly used is the one developed by Donner, Birkett, and Buck (Am J Epidemiol, 1981), probably due to its incorporation in the textbook by Fleiss, Levin, and Paik (Wiley, 2003). In this paper, we derive a new χ² approximation formula with a general continuity correction factor (c) and show that, especially for scenarios with small event rates (< 0.01), the new formulation recommends a lower number of clusters than the Donner et al. formulation, thereby providing better efficiency. All known formulations can be shown to be special cases at specific values of the general correction factor (e.g., the Donner formulation is equivalent to the new formulation for c = 1). A statistical simulation comparing the efficacy of the available methods is presented, identifying correction factors that are optimal for rare event rates. A table of sample size recommendations for a variety of rare event rates, along with code in the R language for easy computation of sample sizes in other settings, is also provided. Sample size calculations for a published CRT (the "Pathways to Health" study, which evaluates the value of an intervention for smoking cessation) are computed for various correction factors to illustrate that, with an optimal choice of the correction factor, the study could have maintained the same power with a 20% smaller sample size.
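A generic sketch of the ingredients involved, assuming the usual variance-inflation (design effect) approach for CRTs with two proportions and a Fleiss-type continuity correction scaled by a factor c; this is a stand-in for the paper's χ² formulation with a general correction factor, and the exact formula in the paper may differ.

```python
from math import sqrt, ceil
from scipy.stats import norm

def crt_clusters_per_arm(p1, p2, m, icc, alpha=0.05, power=0.80, c=1.0):
    """Clusters per arm: m = cluster size, icc = intracluster correlation,
    c = continuity-correction factor (c = 0 gives the uncorrected formula)."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    pbar = (p1 + p2) / 2
    # Individually randomized sample size per arm (normal approximation).
    n = (za * sqrt(2 * pbar * (1 - pbar))
         + zb * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (p1 - p2) ** 2
    # Fleiss-type continuity correction, scaled by c.
    if c > 0:
        n = n / 4 * (1 + sqrt(1 + 4 * c / (n * abs(p1 - p2)))) ** 2
    # Inflate by the design effect and convert to clusters.
    deff = 1 + (m - 1) * icc
    return ceil(n * deff / m)

# Illustrative rare-event scenario (values assumed, not taken from the paper):
print(crt_clusters_per_arm(p1=0.005, p2=0.010, m=200, icc=0.01, c=1.0))
print(crt_clusters_per_arm(p1=0.005, p2=0.010, m=200, icc=0.01, c=0.5))
```

Varying c in such a calculation shows how the choice of continuity correction drives the required number of clusters, which is the trade-off the abstract's simulation study quantifies for rare event rates.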