Abstract: The paper considers the problem of comparing measures of location associated with two dependent groups when values are missing at random, with an emphasis on robust measures of location. It is known that simply imputing missing values can be unsatisfactory when testing hypotheses about means, so the goal here is to compare several alternative strategies that use all of the available data. Included are results on comparing means and 20% trimmed means. Yet another method is based on the usual median but differs from the other methods in a manner that is made obvious. (It is somewhat related to the formulation of the Wilcoxon-Mann-Whitney test for independent groups.) The strategies are compared in terms of Type I error probabilities and power.
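To make the 20% trimmed mean and the idea of using all available data concrete, here is a minimal sketch. It is not one of the paper's test procedures: the function names, the use of `None` to mark a missing value, and the example data are assumptions made for illustration. It only computes the marginal trimmed mean of each group from every observed value, including values whose partner is missing.

```python
import math


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping floor(proportion * n) values from each tail."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one observed value")
    g = math.floor(proportion * len(xs))  # number trimmed per tail
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)


def marginal_trimmed_means(pairs, proportion=0.2):
    """Trimmed mean of each dependent group using all available data.

    `pairs` is a list of (x, y) tuples; None marks a missing value
    (a hypothetical convention for this sketch). A pair with one
    missing member still contributes its observed member.
    """
    first = [x for x, _ in pairs if x is not None]
    second = [y for _, y in pairs if y is not None]
    return trimmed_mean(first, proportion), trimmed_mean(second, proportion)


# Hypothetical paired data with two incomplete pairs.
pairs = [(1, 2), (2, None), (3, 4), (None, 6), (4, 8), (100, 10)]
print(marginal_trimmed_means(pairs))
```

A complete-case analysis would discard the two incomplete pairs entirely; the sketch keeps their observed halves, which is the sense in which the strategies "use all of the available data".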
Abstract: For model selection in mixed effects models, Vaida and Blanchard (2005) demonstrated that the marginal Akaike information criterion is appropriate for questions regarding the population, whereas the conditional Akaike information criterion is appropriate for questions regarding the particular clusters in the data. This article shows that the marginal Akaike information criterion is asymptotically equivalent to leave-one-cluster-out cross-validation and that the conditional Akaike information criterion is asymptotically equivalent to leave-one-observation-out cross-validation.
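The two cross-validation schemes differ only in what is held out at each step. The sketch below generates the train/test index splits for both; the function names and the cluster labels are hypothetical, and no model fitting or information criterion is computed.

```python
def leave_one_observation_out(clusters):
    """Yield (train, test) index lists, holding out one observation at a time."""
    n = len(clusters)
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def leave_one_cluster_out(clusters):
    """Yield (train, test) index lists, holding out one whole cluster at a time."""
    for c in sorted(set(clusters)):
        train = [j for j, cj in enumerate(clusters) if cj != c]
        test = [j for j, cj in enumerate(clusters) if cj == c]
        yield train, test


# Hypothetical cluster label for each of five observations.
clusters = ["a", "a", "b", "b", "b"]
for train, test in leave_one_cluster_out(clusters):
    print("cluster split:", train, test)
```

Holding out a whole cluster asks how well the model predicts for a new cluster (a population-level question), while holding out a single observation asks how well it predicts within clusters already seen.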
Abstract: When there is a rare disease in a population, it is inefficient to take a random sample to estimate a parameter. Instead one takes a random sample of all nuclear families with the disease by ascertaining at least one affected sibling (proband) of each family. In these studies, an estimate of the proportion of siblings with the disease will be inflated. For example, the question of whether a rare disease shows an autosomal recessive pattern of inheritance, where the Mendelian segregation ratios are of interest, has been investigated for several decades. How do we correct for this ascertainment bias? Methods, primarily based on maximum likelihood estimation, are available to correct for the ascertainment bias. We show that for ascertainment bias, although maximum likelihood estimation is optimal under asymptotic theory, it can perform badly. The problem is exacerbated in the situation where the proband probabilities are allowed to vary with the number of affected siblings. We use two data sets to illustrate the difficulties of the maximum likelihood estimation procedure, and we use a simulation study to assess the quality of the maximum likelihood estimators.
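The inflation can be quantified in the simplest setting. The sketch below assumes complete ascertainment (every family with at least one affected sibling is equally likely to be sampled) and families of a single fixed size `s`; both are simplifying assumptions and differ from the varying proband probabilities studied in the paper. Under these assumptions the number of affected siblings follows a zero-truncated binomial distribution, so the expected naive proportion is p / (1 - (1 - p)^s). The function names are hypothetical.

```python
def naive_proportion(p, s):
    """Expected proportion affected among sampled families of size s.

    Assumes complete ascertainment: only families with at least one
    affected sibling are observed (zero-truncated binomial).
    """
    return p / (1.0 - (1.0 - p) ** s)


def corrected_p(observed, s, tol=1e-12):
    """Invert naive_proportion by bisection to recover p.

    A moment-style correction for illustration only; requires
    1/s < observed < 1.
    """
    if not (1.0 / s < observed < 1.0):
        raise ValueError("observed proportion must lie strictly between 1/s and 1")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if naive_proportion(mid, s) < observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Autosomal recessive segregation ratio p = 1/4 in two-sibling families.
inflated = naive_proportion(0.25, 2)
print(inflated, corrected_p(inflated, 2))
```

With p = 1/4 and s = 2 the naive proportion is 4/7, more than double the true segregation ratio, which shows why an uncorrected estimate is misleading.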
Abstract: It is important to estimate the transmissibility of influenza virus during its growing phase in order to understand the propagation of the virus. Procedures for estimating transmissibility are usually based on data generated in flu seasons. The data-generating process of an influenza outbreak has many features. The data are generated not only by a biological process but also by control measures such as flu vaccination. The estimation is discussed by considering these aspects of the data-generating process and by using a model that captures the essential characteristics of flu transmission during the growing phase of a flu season.
Abstract: Data collection for landslide susceptibility modelling is often an almost prohibitive activity. For this reason, landslides have for some time been described and modelled on the basis of spatially distributed values of landslide-related attributes. This paper presents a landslide susceptibility analysis of the Selangor area, Malaysia, using an artificial neural network model with the aid of remote sensing data and geographic information system (GIS) tools. To meet the objectives, landslide locations were identified in the study area from the interpretation of aerial photographs, supported by extensive field surveys. The landslide inventory was then grouped into two categories: (1) training data and (2) testing data. Further, topographical and geological data and satellite images were collected, processed, and assembled into a spatial database using GIS tools and image processing techniques. Nine landslide occurrence attributes were selected and analyzed using an artificial neural network model to generate the landslide susceptibility maps. Landslide location data (training data) were used for training the neural network, and five training sites were selected randomly. The ensemble of five training sites was used to investigate the model reliability, including the role of the thematic variables used to construct the model and the model sensitivity to changes in the selection of the training sites. By studying the variation of the neural network’s susceptibility estimate, the error associated with the model is determined. The results of the neural network analysis are shown on five sets of landslide susceptibility maps. The susceptibility maps were then validated using the receiver operating characteristic (ROC) method as a measure for model verification. Landslide location data that were not used during the training of the neural network were used for the verification of the maps. The results of the analysis were verified using the landslide location data and compared across the five cases. Qualitatively, the model seems to give reasonable results, with observed accuracies of 87%, 83%, 85%, 86% and 82% for the five training sites, respectively.
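As a rough illustration of ROC-based validation, the sketch below computes the area under the ROC curve from susceptibility scores and observed landslide labels, using the pairwise (Mann-Whitney) formulation. Whether the reported accuracies are areas under the curve is an assumption here, and the function name, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen landslide cell
    (label 1) has a higher score than a randomly chosen non-landslide
    cell (label 0), counting ties as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores for four held-out cells.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

The pairwise form is quadratic in the number of cells, which is acceptable for a small verification set; a rank-based computation would scale better for full raster maps.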
Abstract: The study explored the association between the use of Internet services and quality of life in Taiwan. The use of broadband, wireless, and mobile Internet is found to be positively correlated with people’s overall quality of life. The more the Internet services of e-Government are used, the higher the satisfaction with socio-economic status and social competence. People using more Internet services in their daily activities also have higher self-esteem and less psychological pressure. However, people who rely heavily on Internet services for e-Business, such as online shopping or ticket booking, have lower satisfaction with community support.
Abstract: In public health, demography and sociology, large-scale surveys often follow a hierarchical data structure because the surveys are based on multistage stratified cluster sampling. The appropriate approach to analyzing such survey data is therefore based on nested sources of variability which come from different levels of the hierarchy. When the residual errors are correlated between individual observations as a result of these nested structures, traditional logistic regression is inappropriate. We use the binary contraceptive-use data from the 2004 Bangladesh Demographic and Health Survey (BDHS), which is a multistage stratified cluster dataset. This dataset is used to exemplify all aspects of working with multilevel logistic regression models, including model conceptualization, model description, understanding of the structure of required multilevel data, estimation of the model via the statistical package MLwiN, comparison between different estimations, and investigation of the selected determinants of contraceptive use.
Abstract: This paper is concerned with change point analysis in a general class of distributions. The quasi-Bayes and likelihood ratio test procedures are considered for testing the null hypothesis of no change point. Exact and asymptotic behaviors of the two test statistics are derived. To compare the performance of the two test procedures, numerical significance levels and powers of the tests are tabulated for selected values of the parameters. Estimation of the change point based on these two test procedures is also considered. Moreover, the epidemic change point problem is studied as an alternative to the single change point model. A real data set is analyzed under the epidemic change model using the two test procedures.
Abstract: It is known that “standard methods for estimating the causal effect of a time-varying treatment on the mean of a repeated measures outcome (for example, GEE regression) may be biased when there are time-dependent variables that are simultaneously confounders of the effect of interest and are predicted by previous treatment” (Hernán et al. 2002). Inverse-probability-of-treatment weighted (IPTW) methods have been developed in the causal inference literature. In genetic studies, however, the main interest is to estimate or test the genetic effect rather than the treatment effect. In this work, we describe an IPTW method that provides an unbiased estimate of the genetic effect, and discuss how to develop a family-based association test using IPTW for family-based studies. We apply the developed methods to systolic blood pressure data from the Framingham Heart Study, where some subjects took antihypertensive treatment during the course of the study.
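The weighting idea behind IPTW can be sketched for a single time point. This is a deliberate simplification, not the method described in the abstract: with a time-varying treatment the per-visit weights would be multiplied across visits, and the treatment probabilities would be estimated from a model rather than supplied. The function names and all numbers are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Inverse probability of the treatment actually received.

    treated[i] is 1 if subject i took treatment, else 0;
    propensity[i] is the assumed probability that subject i is treated
    given covariates.
    """
    return [1.0 / p if a == 1 else 1.0 / (1.0 - p)
            for a, p in zip(treated, propensity)]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical subjects: treatment indicator, treatment probability,
# and systolic blood pressure.
treated = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.8, 0.2]
sbp = [130.0, 120.0, 150.0, 110.0]

weights = iptw_weights(treated, propensity)
print(weights, weighted_mean(sbp, weights))
```

Subjects who received an unlikely treatment for their covariate profile get larger weights, so the weighted sample mimics one in which treatment is unrelated to the measured confounders.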