Abstract: Childhood obesity is a major health concern. The associated health risks dramatically reduce lifespan and increase healthcare costs. The goal was to develop methodology to identify as early in life as possible whether or not a child would become obese at age five. This diagnostic tool would facilitate clinical monitoring to prevent and/or minimize obesity. Obesity is measured by Body Mass Index (BMI), but an improved metric, the ratio of weight to height (or length) (WOH), is proposed from this research for detecting early obesity. Results of this research demonstrate that WOH performs better than BMI for early detection of obesity in individuals using a longitudinal decision analysis (LDA), which is essentially an individuals-type control chart analysis about a trend line. Utilizing LDA, the odds of obesity of a child at age five are indicated before the second birthday with 95% sensitivity and 97% specificity. Further, obesity at age five is indicated with 75% specificity before two months and with 84% specificity before three months of age. These results warrant expanding this study to larger cohorts of normal, overweight, and obese children at age five from different healthcare facilities to test the applicability of this novel diagnostic tool.
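The two metrics compared above differ only in the power of height in the denominator; a minimal sketch (units and the example measurements are illustrative, not taken from the study):

```python
# Sketch of the two obesity metrics compared in the abstract.
# Assumed units: weight in kilograms, height (or length) in meters.

def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight divided by height squared."""
    return weight_kg / height_m ** 2

def woh(weight_kg: float, height_m: float) -> float:
    """Weight-over-height ratio proposed for early detection."""
    return weight_kg / height_m

# Hypothetical example: an 18 kg child who is 1.05 m tall.
print(round(bmi(18, 1.05), 2))  # 16.33
print(round(woh(18, 1.05), 2))  # 17.14
```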
Abstract: The Weibull distribution has received much interest in reliability theory. The well-known maximum likelihood estimators (MLE) of this family are not available in closed-form expression. In this work, we propose a consistent, closed-form estimator for the shape parameter of the three-parameter Weibull distribution. Apart from its high degree of performance, the derived estimator is location- and scale-invariant.
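The abstract does not reproduce the proposed estimator. As an illustration of what a closed-form, scale-invariant shape estimate can look like, the classical log-moment approach can be sketched: for a two-parameter Weibull (location assumed zero, unlike the paper's three-parameter setting), log X is Gumbel-distributed with variance π²/(6k²), which can be inverted for the shape k. This is not the authors' estimator, only a textbook analogue:

```python
import math
import random

def weibull_shape_logmoment(sample):
    """Closed-form, scale-invariant shape estimate for a two-parameter
    Weibull sample: log X is Gumbel with variance pi^2 / (6 k^2),
    so k is estimated by pi / (sqrt(6) * sd(log X))."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mean = sum(logs) / n
    var = sum((v - mean) ** 2 for v in logs) / (n - 1)
    return math.pi / math.sqrt(6 * var)

# Simulated check: random.weibullvariate(scale, shape).
random.seed(0)
data = [random.weibullvariate(3.0, 2.0) for _ in range(20000)]
print(weibull_shape_logmoment(data))  # close to the true shape 2.0
```

Scale-invariance is immediate: rescaling the data only shifts the logs, leaving their variance unchanged.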
Abstract: Since the late 1930s, factorial analysis of a response measured on the real line has been well established and documented in the literature. No such analysis, however, is available for a response measured on the circle (or, in general, the sphere), despite the fact that many designed experiments in industry, medicine, psychology, and biology could result in an angular response. In this paper a full factorial analysis is presented for a circular response using the Spherical Projected Multivariate Linear model. Main and interaction effects are defined, estimated, and tested. By analogy to the linear response case, two new effect plots, the Circular Main-Effect and Circular Interaction-Effect plots, are proposed to visualize main and interaction effects on circular responses.
Abstract: Although many scoring models have been developed in the literature to offer financial institutions guidance in credit-granting decisions, the purpose of most scoring models is to improve their discrimination ability, not their explanatory ability. Therefore, conventional scoring models can provide only limited information on the relationships among customer demographics, default risk, and credit card attributes such as APR (annual percentage rate) and credit limits. In this paper, a Bayesian behavior scoring model is proposed to help financial institutions identify factors which truly reflect customer value and can affect default risk. To illustrate the proposed model, we applied it to the credit cardholder database provided by one major bank in Taiwan. The empirical results show that increasing APR will raise the default probability greatly. Single cardholders are less accountable for credit card repayment. High-income, female, or more highly educated cardholders are more likely to have good repayment ability.
Abstract: Mosaic plots are state-of-the-art graphics for multivariate categorical data in statistical visualization. Knowledge structures are mathematical models that belong to the theory of knowledge spaces in psychometrics. This paper presents an application of mosaic plots to psychometric data arising from underlying knowledge structure models. In simulation trials and with empirical data, the scope of this graphing method in knowledge space theory is investigated.
Abstract: The use of contingency tables is widespread in archaeology. Cross-tabulations are used in many different studies as a useful tool to synthetically report data, and are also useful when the analyst wishes to search for latent data structures. The latter case is where Correspondence Analysis (CA) comes into play. By graphically displaying the dependence between rows and columns, CA enables the analyst to explore the data in search of a meaningful inner structure. The article aims to show the utility of CA in archaeology in general and, in particular, for the identification of areas devoted to different activities within settlements. The application of CA to the data from a prehistoric village in north-eastern Sicily (P. Milazzese at Panarea, Aeolian Islands, Italy), taken as a case study, shows how CA succeeds in pinpointing different activity areas and in providing grounds to open new avenues of inquiry into other aspects of the archaeological documentation.
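The computation at the heart of CA can be sketched in a few lines: CA takes the singular value decomposition of the table's standardized residuals, and the total inertia (the chi-square statistic divided by the grand total) is the quantity the display partitions across dimensions. The artifact counts below are hypothetical, not the Panarea data:

```python
def ca_residuals_and_inertia(table):
    """Standardized residuals S_ij = (p_ij - r_i c_j) / sqrt(r_i c_j)
    of a contingency table, plus the total inertia = sum of squared
    residuals = chi-square / grand total. CA's row and column
    coordinates come from the SVD of S (omitted here)."""
    n = sum(sum(row) for row in table)
    p = [[x / n for x in row] for row in table]
    r = [sum(row) for row in p]          # row masses
    c = [sum(col) for col in zip(*p)]    # column masses
    s = [[(p[i][j] - r[i] * c[j]) / (r[i] * c[j]) ** 0.5
          for j in range(len(c))] for i in range(len(r))]
    inertia = sum(x * x for row in s for x in row)
    return s, inertia

# Hypothetical counts of three artifact types in three excavation areas.
counts = [[20, 5, 5], [4, 18, 8], [6, 7, 17]]
_, inertia = ca_residuals_and_inertia(counts)
print(round(inertia, 3))  # 0.364: marked row/column association
```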
Abstract: Markov chain Monte Carlo simulation techniques enable the application of Bayesian methods to a variety of models where the posterior density of interest is too difficult to explore analytically. In practice, however, multivariate posterior densities often have characteristics which make implementation of MCMC methods more difficult. A number of techniques have been explored to help speed the convergence of a Markov chain. This paper presents a new algorithm which employs some of these techniques for cases where the target density is bounded. The algorithm is tested on several known distributions to empirically examine convergence properties. It is then applied to a wildlife disease model to demonstrate real-world applicability.
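For a bounded target, the simplest valid scheme is a random-walk Metropolis sampler that rejects any proposal falling outside the support (equivalently, the target density is zero there). The sketch below is a baseline of that kind on an illustrative bounded density, not the paper's new algorithm:

```python
import math
import random

def metropolis_bounded(logpdf, lo, hi, n, step=0.1, seed=1):
    """Random-walk Metropolis on a bounded support (lo, hi).
    Proposals outside the bounds are rejected, which is the correct
    Metropolis move when the target is zero off the support."""
    rng = random.Random(seed)
    x = (lo + hi) / 2
    out = []
    for _ in range(n):
        prop = x + rng.gauss(0, step)
        if lo < prop < hi:
            delta = logpdf(prop) - logpdf(x)
            if delta >= 0 or rng.random() < math.exp(delta):
                x = prop
        out.append(x)
    return out

# Illustrative bounded target: an unnormalized Beta(2, 5) density on (0, 1).
log_beta25 = lambda x: math.log(x) + 4 * math.log(1 - x)
draws = metropolis_bounded(log_beta25, 0.0, 1.0, 50000)
burned = draws[5000:]  # discard burn-in
print(sum(burned) / len(burned))  # near the Beta(2, 5) mean 2/7
```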
Abstract: Supervised classification of biological samples based on genetic information (e.g., gene expression profiles) is an important problem in biostatistics. In order to find both accurate and interpretable classification rules, variable selection is indispensable. This article explores how an assessment of the individual importance of variables (effect size estimation) can be used to perform variable selection. I review recent effect size estimation approaches in the context of linear discriminant analysis (LDA) and propose a new, conceptually simple effect size estimation method which is at the same time computationally efficient. I then show how to use effect sizes to perform variable selection based on the misclassification rate, which is the data-independent expectation of the prediction error. Simulation studies and real data analyses illustrate that the proposed effect size estimation and variable selection methods are competitive. In particular, they lead to both compact and interpretable feature sets. Program files to be used with the statistical software R implementing the variable selection approaches presented in this article are available from my homepage: http://b-klaus.de.
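As a toy illustration of effect-size-based variable selection (not the estimator proposed in the article), each variable can be scored by its standardized mean difference between the two classes and the top-scoring variables retained:

```python
import math

def effect_sizes(group0, group1):
    """Standardized mean difference per variable between two groups;
    each group is a list of rows, one value per variable."""
    def stats(rows, j):
        vals = [r[j] for r in rows]
        m = sum(vals) / len(vals)
        v = sum((u - m) ** 2 for u in vals) / (len(vals) - 1)
        return m, v
    out = []
    for j in range(len(group0[0])):
        m0, v0 = stats(group0, j)
        m1, v1 = stats(group1, j)
        out.append(abs(m1 - m0) / math.sqrt((v0 + v1) / 2))
    return out

# Hypothetical two-gene expression profiles: gene 0 separates the
# classes, gene 1 is pure noise.
healthy = [[0.1, 5.0], [-0.2, 5.2], [0.0, 4.8]]
disease = [[2.1, 5.1], [1.9, 4.9], [2.0, 5.0]]
scores = effect_sizes(healthy, disease)
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(ranked[0])  # gene 0 has the largest effect size
```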
Abstract: Modeling the Internet has been an active research area in the past ten years. From the “rich get richer” behavior to the “winners don’t take all” property, the models depend on the explicit attributes described in the network. This paper discusses the modeling of non-scale-free network subsets such as bulletin forums. A new evolution mechanism, driven by some implicit attributes “hidden” in the network, leads to a slight increase in the page sizes of front-rank forums. Due to the complication of quantifying these implicit attributes, two potential models are suggested. The first model introduces a content ratio and is patched onto the lognormal model, while the second model truncates the data into groups according to their regional specialties, and data within groups are fitted by power-law models. A Taiwan-based bulletin forum is used for illustration, and its data are fitted via four models. Statistical diagnostics show that the two suggested models perform better than the traditional models in data fitting and prediction. In particular, the second model performs better than the first model in general.
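Fitting a power-law (rank-size) model within a group reduces, on the log-log scale, to ordinary least squares; a sketch with hypothetical, exactly power-law forum page sizes:

```python
import math

def fit_power_law(sizes):
    """Least-squares fit of log(size) = a - b * log(rank) for a
    rank-size power law size ~ C * rank^(-b); returns the exponent b.
    sizes must be ordered by rank (largest first)."""
    xs = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    ys = [math.log(s) for s in sizes]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return -slope

# Hypothetical page sizes following size = 1000 * rank^(-1.2).
sizes = [1000 * rank ** -1.2 for rank in range(1, 51)]
print(round(fit_power_law(sizes), 3))  # recovers the exponent 1.2
```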
Abstract: We apply methodology robust to outliers to an existing event study of the effect of U.S. financial reform on the stock markets of the 10 largest world economies, and obtain results that differ from the original OLS results in important ways. This finding underlines the importance of handling outliers in event studies. We further review closely the population of outliers identified using Cook’s distance and find that many of the outliers lie within the event windows. We acknowledge that those data points lead to inaccurate regression fitting; however, we cannot remove them, since they carry valuable information regarding the event effect. We study further the residuals of the outliers within event windows and find that the residuals change with application of M-estimators and MM-estimators; in most cases they become larger, meaning the main prediction equation is pulled back towards the main data population and further from the outliers, indicating a more proper fit. We support our empirical results by pseudo-simulation experiments and find significant improvement in determination of both types of event effect: abnormal returns and change in systematic risk. We conclude that robust methods are important for obtaining accurate measurement of event effects in event studies.
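Cook's distance, used above to screen for influential points, has a simple closed form in the single-regressor case; a sketch on hypothetical return data with one event-window shock:

```python
def cooks_distance(x, y):
    """Cook's distance for the simple OLS regression y = a + b*x.
    D_i = e_i^2 * h_i / (p * s^2 * (1 - h_i)^2), with leverage h_i,
    residual e_i, p = 2 parameters, and s^2 the residual variance.
    Large D_i flags influential observations."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    p = 2
    s2 = sum(e * e for e in resid) / (n - p)
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [e * e * h / (p * s2 * (1 - h) ** 2)
            for e, h in zip(resid, lev)]

# Hypothetical series with a single shock at index 4.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 2.0, 2.9, 4.1, 15.0, 6.0, 7.1, 8.0]
d = cooks_distance(x, y)
print(max(range(len(d)), key=d.__getitem__))  # 4: the shock is flagged
```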