Abstract: Since the late 1930s, factorial analysis of a response measured on the real line has been well established and documented in the literature. No such analysis, however, is available for a response measured on the circle (or, more generally, the sphere), despite the fact that many designed experiments in industry, medicine, psychology and biology can yield an angular response. In this paper a full factorial analysis is presented for a circular response using the Spherical Projected Multivariate Linear model. Main and interaction effects are defined, estimated and tested. By analogy with the linear response case, two new effect plots, the Circular Main Effect plot and the Circular Interaction Effect plot, are proposed to visualize main and interaction effects on circular responses.
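A circular main-effect display rests on level-wise mean directions of the angular response. As a minimal sketch, here is the standard circular (mean-direction) computation for angles in radians; this is not necessarily the paper's SPML-based estimate, only the textbook quantity such a plot would summarize:

```python
import math

def circular_mean(angles):
    """Mean direction of a sample of angles (radians): the atan2 of the
    averaged sine and cosine components, which respects wraparound."""
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    return math.atan2(s, c)
```

Unlike the arithmetic mean, this handles angles that straddle the ±π cut, which is exactly why linear main-effect plots are inadequate for circular responses.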
Abstract: This paper investigates return, volatility, and trading on the Shanghai Stock Exchange using high-frequency intraday five-minute Shanghai Stock Exchange Composite Index (SHCI) data. The random walk hypothesis is rejected, indicating that there are predictable components in the index. We adopt a time-inhomogeneous diffusion model using log penalized splines (log P-splines) to estimate the volatility. A GARCH volatility model is also fitted for comparison. De-volatilized series are obtained using the de-volatilization technique of Zhou (1991), which resamples the data into de-volatilized series with properties more desirable for trading. A trading program based on local trends extracted with a state space model is then implemented on the de-volatilized five-minute SHCI return series for profit. Volatility estimates from both models are found to be competitive for the purpose of trading.
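The abstract does not name the test behind the random-walk rejection; one common diagnostic for return series is a variance-ratio statistic, sketched here under that assumption (the bare ratio only, without its sampling distribution or the paper's actual test):

```python
import statistics

def variance_ratio(returns, q):
    """Variance-ratio statistic: variance of overlapping q-period return sums
    divided by q times the one-period variance; close to 1 under a random
    walk, below 1 under mean reversion, above 1 under trending."""
    q_sums = [sum(returns[i:i + q]) for i in range(len(returns) - q + 1)]
    return statistics.pvariance(q_sums) / (q * statistics.pvariance(returns))
```

For example, a perfectly mean-reverting alternating series gives a ratio of 0 at q = 2, the opposite extreme from the random-walk value of 1.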
Abstract: It is believed that overdispersion, or extravariation as it is often called, is more often present in survey data due to heterogeneity among and between the units. One approach to addressing this phenomenon is to use a generalized Dirichlet-multinomial model. In its applications the generalized Dirichlet-multinomial model assumes that the clusters are of equal size and that the number of clusters remains the same over time. In practice this is rarely the case when clusters are observed over time. In this paper the random variability and the varying response rates are accounted for in the model, which requires modeling another level of variation. In effect, the result is a hierarchical model that allows varying response rates in the presence of overdispersed multinomial data. The model and its applicability are demonstrated through an illustrative application to a subset of the well-known High School and Beyond survey data.
Abstract: We investigate whether the posterior predictive p-value can detect unknown hierarchical structure. We select several common discrepancy measures (i.e., mean, median, standard deviation, and χ² goodness-of-fit) whose choice is not motivated by knowledge of the hierarchical structure. We show that if we use the entire data set, these discrepancy measures do not detect hierarchical structure. However, if we make use of the subpopulation structure, many of these discrepancy measures are effective. The use of this technique is illustrated by studying the case where the data come from a two-stage hierarchical regression model while the fitted model does not include this feature.
We develop a health informatics toolbox that enables timely analysis and evaluation of the time-course dynamics of a range of infectious disease epidemics. As a case study, we examine the novel coronavirus (COVID-19) epidemic using the publicly available data from the China CDC. This toolbox is built upon a hierarchical epidemiological model in which two observed time series of daily proportions of infected and removed cases are generated from the underlying infection dynamics governed by a Markov Susceptible-Infectious-Removed (SIR) infectious disease process. We extend the SIR model to incorporate various types of time-varying quarantine protocols, including government-level ‘macro’ isolation policies and community-level ‘micro’ social distancing (e.g. self-isolation and self-quarantine) measures. We develop a calibration procedure for underreported infected cases. This toolbox provides forecasts, in both online and offline forms, as well as simulating the overall dynamics of the epidemic. An R software package is made available for the public, and examples of its use are provided. Some possible extensions of our novel epidemiological models are discussed.
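The underlying SIR dynamics driving the two observed proportion series can be sketched in discrete time as follows. The parameter values in the test are hypothetical, and the paper's model adds time-varying quarantine effects and a hierarchical observation layer on top of this deterministic skeleton:

```python
def sir_simulate(beta, gamma, s0, i0, r0, steps):
    """Iterate the classic SIR proportions: each step moves beta*S*I from
    susceptible to infectious and gamma*I from infectious to removed, so
    S + I + R stays constant (here normalized to 1)."""
    s, i, r = s0, i0, r0
    path = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i
        new_removals = gamma * i
        s, i, r = s - new_infections, i + new_infections - new_removals, r + new_removals
        path.append((s, i, r))
    return path
```

In the quarantine extension described above, `beta` would itself be a function of time reflecting macro isolation policies and micro social distancing, rather than a constant.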
The generalized gamma model has been used in several applied areas such as engineering, economics and survival analysis. We provide an extension of this model called the transmuted generalized gamma distribution, which includes as special cases some lifetime distributions. The proposed density function can be represented as a mixture of generalized gamma densities. Some mathematical properties of the new model, such as the moments, generating function, mean deviations and Bonferroni and Lorenz curves, are provided. We estimate the model parameters using maximum likelihood. We show by means of a real data set that the proposed distribution can be a competitive model in lifetime applications.
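Transmuted families are typically built from a baseline CDF G via the quadratic rank transmutation map F(x) = (1 + λ)G(x) − λG(x)², with |λ| ≤ 1. A minimal sketch of that map, using the exponential distribution, a closed-form special case of the generalized gamma, as an illustrative baseline in place of the full generalized gamma CDF:

```python
import math

def transmuted_cdf(G, lam):
    """Quadratic rank transmutation: F(x) = (1 + lam)*G(x) - lam*G(x)**2,
    valid for |lam| <= 1; lam = 0 recovers the baseline G."""
    if abs(lam) > 1:
        raise ValueError("the transmutation parameter must lie in [-1, 1]")
    return lambda x: (1 + lam) * G(x) - lam * G(x) ** 2

def exp_cdf(x, rate=1.0):
    """Exponential CDF, a closed-form special case of the generalized gamma."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

F = transmuted_cdf(exp_cdf, lam=0.5)
```

Setting λ = 0 recovers the baseline, which is how the family contains its special cases.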
Abstract: Different models are used in practice for describing binary longitudinal data. In this paper we consider the joint probability models, the marginal models, and the combined models for describing such data best. The combined model consists of a joint probability model and a marginal model at two different levels. We present some striking empirical observations on the closeness of the estimates and their standard errors for some parameters of the models considered in describing a data set from Fitzmaurice and Laird (1993), and consequently offer new insight into these data. We present the data in a complete factorial arrangement with 4 factors at 2 levels. We introduce the concept of “data representing a model completely” and explain “data balance” as well as “chance balance”. We also consider the problem of selecting the best model for describing these data, using the Search Linear Model concepts known in Fractional Factorial Design research (Srivastava (1975)).
This article discusses the estimation of the Generalized Power Weibull parameters using the maximum product of spacings (MPS) method, the maximum likelihood (ML) method, and Bayesian estimation under the squared-error loss function. The estimation is carried out under progressive type-II censored samples, and a comparative study among the three methods is made using Monte Carlo simulation. A Markov chain Monte Carlo (MCMC) method is employed to compute the Bayes estimators of the Generalized Power Weibull distribution. The optimal censoring scheme is suggested using different optimality criteria (mean squared error, bias, and relative efficiency). A real data set is used, for illustrative purposes, to study the performance of the estimation process under this optimal scheme in practice. Finally, we discuss a method of obtaining the optimal censoring scheme.
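The MPS method chooses the parameters that maximize the product of spacings of the fitted CDF evaluated at the ordered sample. A sketch of that criterion with a grid search, using an exponential CDF as a stand-in for the Generalized Power Weibull CDF and ignoring censoring, which the paper handles; the function names are illustrative:

```python
import math

def log_spacings(cdf_vals):
    """Sum of log spacings D_i = F(x_(i)) - F(x_(i-1)), with the conventions
    F(x_(0)) = 0 and F(x_(n+1)) = 1; MPS maximizes this sum."""
    vals = [0.0] + sorted(cdf_vals) + [1.0]
    return sum(math.log(max(b - a, 1e-12)) for a, b in zip(vals, vals[1:]))

def mps_exponential_rate(data, grid):
    """Grid-search MPS estimate of an exponential rate; replacing the
    exponential CDF below with the Generalized Power Weibull CDF gives the
    (uncensored) estimator discussed in the article."""
    def score(rate):
        return log_spacings([1.0 - math.exp(-rate * x) for x in data])
    return max(grid, key=score)
```

Because the spacings sum to 1, their product is maximized when they are equal, so a sample sitting exactly at the fitted quantiles scores highest; this is the sense in which MPS matches the model CDF to the empirical spacings.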