Hierarchical Bayes models have been used in disease mapping to examine small-scale geographic variation. State-level geographic variation in less common causes of mortality has been reported; however, county-level variation is rarely examined. Because of concerns about statistical reliability and confidentiality, county-level mortality rates based on fewer than 20 deaths are suppressed under the statistical reliability criteria of the Division of Vital Statistics, National Center for Health Statistics (NCHS), which precludes examining spatio-temporal variation in less common mortality outcomes, such as suicide rates (SRs), at the county level using direct estimates. Existing Bayesian spatio-temporal modeling strategies can be applied via Integrated Nested Laplace Approximation (INLA) in R to a large number of rare mortality outcomes to enable examination of spatio-temporal variation on smaller geographic scales such as counties. This method allows examination of spatio-temporal variation across the entire U.S., even where the data are sparse. We used mortality data from 2005-2015 to explore spatio-temporal variation in SRs, as one particular application of the Bayesian spatio-temporal modeling strategy in R-INLA to predict year- and county-specific SRs. Specifically, hierarchical Bayesian spatio-temporal models were implemented in R-INLA with spatially structured and unstructured random effects, correlated time effects, time-varying confounders, and space-time interaction terms, borrowing strength across both counties and years to produce smoothed county-level SRs. Model-based estimates of SRs were mapped to explore geographic variation.
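For orientation, a model with the components listed in this abstract is often written in the following generic form. This is only a sketch under assumptions: the abstract does not state the likelihood, the link function, or the priors, so the Poisson likelihood, the log link, and all symbols below are illustrative choices rather than the authors' specification.

```latex
% Assumed notation (not from the abstract):
%   y_{it}      observed suicide deaths in county i, year t
%   E_{it}      expected count (or population offset)
%   \theta_{it} relative rate to be smoothed
\begin{aligned}
y_{it} \mid \theta_{it} &\sim \operatorname{Poisson}\!\left(E_{it}\,\theta_{it}\right), \\
\log \theta_{it} &= \alpha + \mathbf{x}_{it}^{\top}\boldsymbol{\beta}
                   + u_i + v_i + \gamma_t + \delta_{it}.
\end{aligned}
% u_i         spatially structured county effect
% v_i         unstructured county effect
% \gamma_t    temporally correlated year effect
% \delta_{it} space-time interaction
% x_{it}      time-varying confounders
```

Each term on the right-hand side corresponds to one of the components named above; borrowing strength across counties and years comes from the priors placed on the random effects.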
Abstract: Lifestyles can be used to explain existing and to anticipate future consumer behavior, both in a geographical and a temporal context. Basing market segmentations on consumer lifestyles enables the development of purposeful advertising strategies and the design of new products meeting future demands. The present paper introduces a new growing self-organizing neural network which identifies lifestyles, or rather consumer types, in survey data largely autonomously. Before applying the algorithm to real marketing data, we demonstrate its general performance and adaptability by means of synthetic 2D data featuring distinct heterogeneity with respect to the arrangement of the individual data points.
As a robust data analysis technique, quantile regression has attracted extensive interest. In this study, the weighted quantile regression (WQR) technique is developed based on the sparsity function. We first consider the linear regression model and show that the relative efficiency of WQR compared with least squares (LS) and composite quantile regression (CQR) is greater than 70% regardless of the error distribution. To make the proposed method more useful in practice, we consider two nontrivial extensions. The first concerns a nonparametric model. A local WQR estimate is introduced to explore the nonlinear data structure and is shown to be much more efficient than other estimates under various non-normal error distributions. The second extension concerns a multivariate problem where variable selection is needed along with regularization. We couple the WQR with penalization and show that, under mild conditions, the penalized WQR enjoys the oracle property. The WQR has an intuitive formulation and can be easily implemented. Simulation is conducted to examine its finite sample performance and to compare it against alternatives. An analysis of a mammal dataset is also conducted. Numerical studies are consistent with the theoretical findings and indicate the usefulness of WQR.
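To make the building block concrete, the following minimal Python sketch shows the standard quantile (check) loss and a weighted sum of such losses over several quantile levels. The function names, the toy data, and the user-supplied weights are assumptions for illustration; the abstract does not give the WQR weights, which the paper derives from the sparsity function.

```python
def check_loss(u, tau):
    """Standard check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_objective(c, ys, tau):
    """Sum of check losses for a constant fit c at quantile level tau."""
    return sum(check_loss(y - c, tau) for y in ys)


def weighted_objective(fits, ys, weights):
    """Weighted sum of quantile objectives.

    fits    : dict mapping tau -> fitted constant at that level
    weights : dict mapping tau -> non-negative weight (hypothetical values;
              not the sparsity-based weights of the paper)
    """
    return sum(weights[tau] * quantile_objective(c, ys, tau)
               for tau, c in fits.items())


# Toy data (made up for illustration).
ys = [1.0, 2.0, 3.0, 4.0, 5.0]

# Minimizing the check loss over a constant recovers a sample quantile;
# here the search is restricted to the observed values.
best_median = min(ys, key=lambda c: quantile_objective(c, ys, 0.5))
```

With `tau = 0.5` the check loss is half the absolute error, so the minimizer over the observed values is the sample median of the toy data.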
Abstract: Watching videos online has become a popular activity for people around the world. To manage revenue from online advertising, an efficient Ad server that can match advertisements to targeted users is needed. In general, the users’ demographics are provided to an Ad server by an inference engine, which infers them using a profile reasoning technique. Rich media streaming through broadband networks has made a significant impact on how profile reasoning for online television users can be implemented. Compared to traditional broadcasting services such as satellite and cable, broadcasting through broadband networks enables bidirectional communication between users and content providers. In this paper, a user profile reasoning technique based on a logistic regression model is introduced. The inference model takes into account genre preferences and viewing time from users in different age/gender groups. Historical viewing data were used to train and build the model. Different input data processing and model building strategies are discussed. Experimental results are also provided to show how effective the proposed technique is.
Abstract: Existing methods for sample size calculation with right-censored data largely assume that the failure times follow an exponential distribution or the Cox proportional hazards model. Methods under the additive hazards model are scarce. Motivated by a well-known example of right-censored failure time data that the additive hazards model fits better than the Cox model, we propose a method for power and sample size calculation for a two-group comparison assuming the additive hazards model. This model allows the investigator to specify a group difference in terms of a hazard difference and to choose increasing, constant, or decreasing baseline hazards. The power computation is based on the Wald test. Extensive simulation studies are performed to demonstrate the performance of the proposed approach. Our simulation also shows substantially decreased power if the additive hazards model is misspecified as the Cox proportional hazards model.
Abstract: Central composite design (CCD) is widely applied in many fields to construct a second-order response surface model with quantitative factors and to help increase the precision of the estimated model. When an experiment also includes qualitative factors, the effects between the quantitative and qualitative factors should be taken into consideration. In the present paper, D-optimal designs are investigated for models where the qualitative factors interact with, respectively, the linear effects, or the linear effects and 2-factor interactions or quadratic effects of the quantitative factors. It is shown that, at each qualitative level, the corresponding D-optimal design also consists of the same three portions as a CCD, i.e. the cube design, the axial design and center points, but with different weights. An example from a chemical study is used to demonstrate how the D-optimal design obtained here may help to design an experiment with both quantitative and qualitative factors more efficiently.
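The three portions of a CCD mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the classical design in coded units with quantitative factors only. The function name, the default rotatable axial distance, and the number of center runs are assumptions, and the sketch does not compute the D-optimal weights studied in the paper.

```python
from itertools import product


def central_composite_design(k, alpha=None, n_center=1):
    """Return the cube, axial and center portions of a CCD for k factors.

    alpha    : axial distance in coded units; defaults to the rotatable
               choice (2**k) ** 0.25 for a full factorial cube
    n_center : number of replicated center runs (illustrative default)
    """
    if alpha is None:
        alpha = (2 ** k) ** 0.25

    # Cube portion: full two-level factorial at -1 / +1.
    cube = [tuple(float(s) for s in signs)
            for signs in product((-1, 1), repeat=k)]

    # Axial portion: +/- alpha on one factor, 0 on all others.
    axial = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[j] = sign * alpha
            axial.append(tuple(point))

    # Center portion: replicated runs at the origin.
    center = [tuple([0.0] * k) for _ in range(n_center)]
    return cube, axial, center


cube, axial, center = central_composite_design(k=2, n_center=3)
```

For `k` factors this gives `2**k` cube points and `2*k` axial points; a design weighted as in the abstract would keep these support points but assign them unequal weights.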
Abstract: The power law process has been used extensively in software reliability models, reliability growth models and more generally reliable systems. In this paper we study the power law process via an empirical Bayes (EB) approach. Based on a two-hyperparameter natural conjugate prior and a more general three-hyperparameter natural conjugate prior, which was stated in Huang and Bier (1998), we work out an EB procedure and provide statistical inferences based on the natural conjugate priors. Given past experience about the parameters of the model, the EB approach uses the observed data to estimate the hyperparameters of the priors and then proceeds as though the prior were known.
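As background for the model itself, the following Python sketch writes out the power law process under one common parameterization, with intensity (beta/theta) * (t/theta)**(beta - 1), together with the textbook maximum likelihood estimates for a time-truncated observation window. It does not implement the empirical Bayes procedure of the paper. The parameterization, the function names, and the failure times are assumptions for illustration.

```python
import math


def plp_intensity(t, beta, theta):
    """Intensity lambda(t) = (beta / theta) * (t / theta) ** (beta - 1)."""
    return (beta / theta) * (t / theta) ** (beta - 1.0)


def plp_mean(t, beta, theta):
    """Expected number of events in (0, t]: (t / theta) ** beta."""
    return (t / theta) ** beta


def plp_mle_time_truncated(times, T):
    """Maximum likelihood estimates when observation stops at time T.

    times : event times with 0 < t_i < T
    Returns (beta_hat, theta_hat).
    """
    n = len(times)
    beta_hat = n / sum(math.log(T / t) for t in times)
    theta_hat = T / n ** (1.0 / beta_hat)
    return beta_hat, theta_hat


# Hypothetical failure times observed up to T = 100 (made-up values).
failure_times = [10.0, 40.0, 90.0]
T = 100.0
beta_hat, theta_hat = plp_mle_time_truncated(failure_times, T)
```

A quick consistency check is that the fitted mean function at `T` reproduces the observed number of events, which follows directly from the form of `theta_hat`.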
Abstract: This paper studies an effective stepwise hypothesis testing procedure for identifying dynamic relations between time series, and its close connection with popular information criteria such as AIC and BIC. This procedure, labeled M2, extends Chen and Lee’s (1990) procedure to cover both strong and weak form dynamic relations, and to be used with a guided choice of significance levels that are adaptive in nature. Intuitively, procedure M2 can be viewed as a backward-elimination approach that simplifies the all-possible pairwise comparisons approach implied by information criteria (IC). New insights concerning identification of strong and weak form dynamic relations using these approaches are given. Extensive simulation experiments are conducted to illustrate the performance of the IC and M2 approaches in different settings. As an application, we study the dynamic relations between price level and interest rate in the US and UK, and the robustness of the identified model is also addressed.
Abstract: The study of the pattern of female child births is one of the most crucial areas of human demography because it plays a very important role in the building of a nation. In the present study, an attempt has been made to work out the pattern of female child births among females belonging to different subdomains of the population through a probability model, and the parameters involved in the model have also been estimated. For illustration, the suggested model has been applied to an observed set of data taken from NFHS-III (2005-06) for the seven North East states of India known as the Seven Sisters.
Abstract: This paper describes a statistical model developed from correspondence analysis to date archaeological contexts of the city of Tours (France) and to obtain an estimated absolute timescale. The data set used in the study is reported as a contingency table of ceramics against contexts. But, as pottery is not intrinsically a dating indicator (a date is rarely inscribed on each piece of pottery), we estimate dates of contexts from their finds, and we use coins to attest the date of assemblages. The model-based approach uses classical tools (correspondence analysis, linear regression and resampling methods) in an iterative scheme. Archaeologists may find in the paper a useful set of known statistical methods, while statisticians can learn a way to order well-known techniques. No method is new, but their combination is characteristic of this application.