We introduce a new class of distributions called the generalized odd generalized exponential family. We derive some of its mathematical properties, including explicit expressions for the ordinary and incomplete moments, quantile and generating functions, Rényi, Shannon and q-entropies, order statistics and probability weighted moments. We also propose bivariate generalizations, construct a simple type copula and introduce a useful stochastic property. The maximum likelihood method is used for estimating the model parameters. The importance and flexibility of the new family are illustrated by means of two applications to real data sets. We assess the performance of the maximum likelihood estimators in terms of biases and mean squared errors via a simulation study.
In this paper, we consider a functional varying coefficient model in the presence of a time-invariant covariate for sparse longitudinal data contaminated with measurement errors. We propose a regularization method to estimate the slope function based on a reproducing kernel Hilbert space approach. The procedure is easy to implement. Our simulation results show that the procedure performs well, especially when either the sampling frequency or the sample size increases. Applications of our method are illustrated in an analysis of a longitudinal CD4+ count dataset from an HIV study.
Abstract: Multiple imputation under the multivariate normality assumption has often been regarded as a viable model-based approach in dealing with incomplete continuous data. Considering the fact that real data rarely conform to normality, there has been growing attention to generalized classes of distributions that cover a broader range of skewness and elongation behavior compared to the normal distribution. In this regard, two recent works have shown that creating imputations under Fleishman’s power polynomials and the generalized lambda distribution may be a promising tool. In this article, essential distributional characteristics of these families are illustrated along with a description of how they can be used to create multiply imputed data sets. Furthermore, an application is presented using a data example from psychiatric research. Multiple imputation under these families, which span most of the feasible area in the symmetry-peakedness plane, appears to have substantial potential for capturing real missing-data trends that can be encountered in clinical practice.
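As a rough illustration of the first family, Fleishman's power method passes a standard normal variate Z through a cubic polynomial Y = a + bZ + cZ^2 + dZ^3 with a = -c, so that Y has mean zero. The sketch below assumes the coefficients b, c and d are already known; solving for them from a target skewness and kurtosis, and the imputation step itself, are not shown. The function names are hypothetical and chosen only for this example.

```python
import random


def fleishman_transform(z, b, c, d):
    """Apply the cubic polynomial with a = -c, which gives the result mean zero."""
    return -c + b * z + c * z**2 + d * z**3


def fleishman_variance(b, c, d):
    """Theoretical variance of the transformed variable when z is standard normal."""
    return b**2 + 6 * b * d + 2 * c**2 + 15 * d**2


def draw_sample(n, b, c, d, seed=0):
    """Draw n transformed values from standard normal inputs (seeded for repeatability)."""
    rng = random.Random(seed)
    return [fleishman_transform(rng.gauss(0.0, 1.0), b, c, d) for _ in range(n)]
```

Setting b = 1 and c = d = 0 recovers the standard normal case; nonzero c introduces skewness and nonzero d changes the tail weight.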
Abstract: According to the 2006 Programme for International Student Assessment (PISA), sixteen Organization for Economic Cooperation and Development (OECD) countries had scores that were significantly higher than the US. The top three performers were Finland, Canada, and Japan. While Finland and Japan are vastly different from the US in terms of cultures and educational systems, the US and Canada are similar to each other in many aspects, thus their performance gap was investigated. In this study, data mining was employed to identify factors regarding access to and use of resources, as well as student views on science, for predicting PISA science scores among Grade 10 American and Canadian students. It was found that science enjoyment and frequent use of educational software play important roles in the academic achievement of Canadian students.
Abstract: It is shown that the most popular posterior distribution for the mean of the normal distribution is obtained by deriving the distribution of the ratio X/Y when X and Y are normal and Student’s t random variables distributed independently of each other. Tabulations of the associated percentage points are given along with a computer program for generating them.
Abstract: In alcohol studies, drinking outcomes such as number of days of any alcohol drinking (DAD) over a period of time do not precisely capture the differences among subjects in a study population of interest. For example, the value of 0 on DAD could mean that the subject was continually abstinent from drinking, as with lifetime abstainers, or that the subject was alcoholic but happened not to use any alcohol during the period of interest. In statistics, zeros of the first kind are called structural zeros, to distinguish them from the sampling zeros of the second kind. As the example indicates, the structural and sampling zeros represent two groups of subjects with quite different psychosocial outcomes. In the literature on alcohol use, although many recent studies have begun to explicitly account for the differences between the two types of zeros in modeling drinking variables as a response, none has acknowledged the implications of the different types of zeros when such drinking variables are used as a predictor. This paper serves as the first attempt to tackle the latter issue and to illustrate the importance of disentangling the structural and sampling zeros by using simulated as well as real study data.
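The distinction between the two kinds of zeros can be made concrete with a small simulation. The sketch below is only an assumption about how such data might be generated: abstainers always report 0 days, while the remaining subjects report a Poisson count that can also happen to be 0. The Poisson choice, the parameter values and the function names are illustrative and are not taken from the paper.

```python
import math
import random


def poisson_draw(rng, lam):
    """Knuth's multiplication method for one Poisson variate (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_dad(n, p_abstainer, mean_days, seed=0):
    """Return (is_abstainer, days) pairs; abstainers always have 0 drinking days."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        abstainer = rng.random() < p_abstainer
        days = 0 if abstainer else poisson_draw(rng, mean_days)
        records.append((abstainer, days))
    return records


def count_zeros(records):
    """Split the observed zeros into structural (abstainers) and sampling (drinkers)."""
    structural = sum(1 for abstainer, _ in records if abstainer)
    sampling = sum(1 for abstainer, days in records if not abstainer and days == 0)
    return structural, sampling
```

In observed data only the days column is available, so the two groups of zeros are indistinguishable unless the model accounts for them separately.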
Abstract: Conventional sampling in biostatistics and economics posits an individual in a fixed observable state (e.g., diseased or not, poor or not, etc.). Social, market, and opinion research, however, require a cognitive sampling theory which recognizes that a respondent has a choice between two options (e.g., yes versus no). This new theory posits the survey respondent as a personal probability. Once the sample is drawn, a series of independent non-identical Bernoulli trials is carried out. The outcome of each trial is a momentary binary choice governed by this unobserved probability. Liapunov’s extended central limit theorem (Lehmann, 1999) and the Horvitz-Thompson (1952) theorem are then brought to bear on sampling unobservables, in contrast to sampling observations. This formulation reaffirms the usefulness of a weighted sample proportion, which is now seen to estimate a different target parameter than that of conventional design-based sampling theory.
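A minimal sketch of the setup described above, under stated assumptions: each sampled respondent has an unobserved personal probability and a known inclusion probability, the answer is one Bernoulli draw, and the weighted sample proportion is taken here in its ratio form with inverse-inclusion-probability weights. The abstract does not specify the exact estimator, so this form and the function names are assumptions.

```python
import random


def simulate_responses(personal_probs, seed=0):
    """One independent Bernoulli trial per respondent, each with its own probability."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for p in personal_probs]


def weighted_proportion(responses, inclusion_probs):
    """Weighted sample proportion using inverse inclusion probabilities as weights."""
    weights = [1.0 / pi for pi in inclusion_probs]
    weighted_yes = sum(w * y for w, y in zip(weights, responses))
    return weighted_yes / sum(weights)
```

With equal inclusion probabilities the estimate reduces to the ordinary sample proportion of "yes" answers.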