Abstract: The primary advantage of panel over cross-sectional regression stems from its control for the effects of omitted variables or "unobserved heterogeneity". However, panel regression is based on the strong assumptions that measurement errors are independently and identically distributed (i.i.d.) and normal. These assumptions are evaded by design-based regression, which dispenses with measurement errors altogether by regarding the response as a fixed real number. The present paper establishes a middle ground between these extreme interpretations of longitudinal data. The individual is now represented as a panel of responses containing dependently non-identically distributed (d.n.d.) measurement errors. Modeling the expectations of these responses preserves the Neyman randomization theory, rendering panel regression slopes approximately unbiased and normal in the presence of arbitrarily distributed measurement error. The generality of this reinterpretation is illustrated with German Socio-Economic Panel (GSOEP) responses that are discretely distributed on a 3-point scale.
Abstract: This article extends the recent work of Vännman and Albing (2007) regarding the new family of quantile-based process capability indices (qPCI) CMA(τ, v). We develop both asymptotic parametric and nonparametric confidence limits and testing procedures for CMA(τ, v). A kernel density estimator of the process distribution is proposed to obtain a consistent estimator of the variance of the nonparametric estimator of CMA(τ, v). The proposed procedure is therefore ready for practical implementation on any process. Illustrative examples are also provided to show the steps of implementing the proposed methods directly on real-life problems. We also present a simulation study on the sample size required for using the asymptotic results.
In this paper, we introduce a new generalized family of distributions from the bounded support (0,1), namely, the Topp-Leone-G family. Some mathematical properties of the proposed family are studied. The new density function can be symmetrical, left-skewed, right-skewed or reverse-J shaped. Furthermore, the hazard rate function can be constant, increasing, decreasing, J-shaped or bathtub-shaped. Three special models are discussed. We obtain simple expressions for the ordinary and incomplete moments, quantile and generating functions, mean deviations and entropies. The method of maximum likelihood is used to estimate the model parameters. The flexibility of the new family is illustrated by means of three real data sets.
Abstract: In the United States, diabetes is common and costly. Programs to prevent new cases of diabetes are often carried out at the level of the county, a unit of local government. Thus, efficient targeting of such programs requires county-level estimates of diabetes incidence, that is, the fraction of the nondiabetic population who received their diagnosis of diabetes during the past 12 months. Previously, only estimates of prevalence (the overall fraction of the population who have the disease) have been available at the county level. Counties with high prevalence might or might not be the same as counties with high incidence, due to spatial variation in mortality and relocation of persons with incident diabetes to another county. Existing methods cannot be used to estimate county-level diabetes incidence, because the fraction of the population who receive a diabetes diagnosis in any year is too small. Here, we extend previously developed methods of Bayesian small-area estimation of prevalence, using diffuse priors, to estimate diabetes incidence for all U.S. counties based on data from a survey designed to yield state-level estimates. We found high incidence in the southeastern United States, the Appalachian region, and in scattered counties throughout the western U.S. Our methods might be applicable in other circumstances in which all cases of a rare condition also must be cases of a more common condition (in this analysis, "newly diagnosed cases of diabetes" and "cases of diabetes"). If appropriate data are available, our methods can be used to estimate the proportion of the population with the rare condition at greater geographic specificity than the data source was designed to provide.
Abstract: Complexities involved with identifying the projection for a specific set of k factors (k = 2, ..., 11) from an n-run (n = 12, 20 or 24) Plackett-Burman design are described. Once the correct projection is determined, difficulties with selecting the necessary additional runs to complete either the full or half fraction factorial for the respective projection are noted, especially for n = 12, 20 or 24 and k = 4 or 5. Because of these difficulties, a user-friendly computational approach that identifies the projection and corresponding necessary follow-up runs to complete the full or half fraction factorial is given. The method is illustrated with a real data example.
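As a rough illustration of what "identifying the projection" and "follow-up runs" mean, the sketch below builds the 12-run Plackett-Burman design and lists the sign combinations missing from a chosen set of factors. It is not the computational approach of the paper. The generator row is the commonly tabulated one for n = 12 and is an assumption here, since the abstract does not list the design matrix, and the helper names (`pb12`, `projection_counts`, `missing_runs`, `all_projections_complete`) are hypothetical.

```python
from itertools import combinations, product

# Commonly tabulated generator row for the 12-run Plackett-Burman design
# (an assumption: the abstract does not reproduce the design matrix).
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def pb12():
    """Return the 12 x 11 design: 11 cyclic shifts of the generator plus a row of -1."""
    k = len(GENERATOR)
    rows = [GENERATOR[-s:] + GENERATOR[:-s] for s in range(k)]
    rows.append([-1] * k)
    return rows


def projection_counts(design, factors):
    """Count how often each sign combination occurs in the chosen columns."""
    counts = {combo: 0 for combo in product((-1, +1), repeat=len(factors))}
    for run in design:
        counts[tuple(run[j] for j in factors)] += 1
    return counts


def missing_runs(design, factors):
    """Sign combinations absent from the projection, i.e. candidate follow-up
    runs for completing a full factorial in the chosen factors."""
    return [c for c, n in projection_counts(design, factors).items() if n == 0]


def all_projections_complete(design, k):
    """True if every choice of k columns already contains the full 2^k factorial."""
    n_cols = len(design[0])
    return all(not missing_runs(design, f) for f in combinations(range(n_cols), k))


design = pb12()
print("every 3-factor projection is a full factorial:",
      all_projections_complete(design, 3))
print("runs missing from the projection onto factors 1-4:",
      missing_runs(design, (0, 1, 2, 3)))
```

With 12 runs and 16 sign combinations, a four-factor projection always lacks at least four combinations, which is why follow-up runs are needed for k = 4 or 5. Choosing runs that complete a half fraction rather than the full factorial would need an additional check on the defining relation, which this sketch does not attempt.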
Abstract: We have studied the effect of several factors that influence recombinant protein production, by using the expression of recombinant streptolysin-O as our model. This protein, produced by Streptococcus pyogenes, is important in the biotechnological industry, where it is used to produce immunodiagnostic reagents. In order to improve the yield of this protein, we tried an alternative production method using strains of Escherichia coli and recombinant DNA technology. We have evaluated this method at the laboratory scale, taking into account factors such as inducer concentration, temperature of induction, proportion of culture medium volume to total flask volume, and strain of Escherichia coli used. To this end we applied techniques of experimental design, particularly a "fixed-effects bifactorial design", with the expression level of recombinant streptolysin-O in E. coli being the response to the factors. All the effects studied were found to be significant and relevant to the economics of the protein production.
Abstract: We introduce a new class of continuous distributions called the Kumaraswamy transmuted-G family, which extends the transmuted class defined by Shaw and Buckley (2007). Some special models of the new family are provided. Some of its mathematical properties, including explicit expressions for the ordinary and incomplete moments, generating function, Rényi and Shannon entropies, order statistics and probability weighted moments, are derived. The method of maximum likelihood is used for estimating the model parameters. The flexibility of the generated family is illustrated by means of two applications to real data sets.
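The abstract does not state the distribution function of the family. The sketch below shows one plausible construction, under the assumption that the family is obtained by applying the Kumaraswamy-G generator, 1 - (1 - H^a)^b, to the quadratic transmutation map H = (1 + λ)G - λG² with |λ| ≤ 1. The function names, the parameter names `a`, `b` and `lam`, and the exponential baseline are illustrative choices, not taken from the paper.

```python
import math


def transmuted_cdf(g, lam):
    """Quadratic transmutation of a baseline CDF value g in [0, 1]; requires |lam| <= 1."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    return (1.0 + lam) * g - lam * g * g


def kw_transmuted_cdf(g, a, b, lam):
    """Assumed Kumaraswamy transmuted-G CDF: Kumaraswamy-G generator applied
    to the transmuted baseline (a > 0, b > 0 are extra shape parameters)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    t = transmuted_cdf(g, lam)
    return 1.0 - (1.0 - t ** a) ** b


def exponential_cdf(x, rate=1.0):
    """Illustrative baseline G: the exponential distribution."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


# Evaluate the assumed CDF on a small grid with an exponential baseline.
for x in (0.5, 1.0, 2.0):
    print(x, kw_transmuted_cdf(exponential_cdf(x), a=2.0, b=3.0, lam=0.5))
```

Under this assumed form, setting a = b = 1 recovers the transmuted class, and additionally setting λ = 0 recovers the baseline G, which is consistent with the abstract's description of the family as an extension of the transmuted class.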
Abstract: HIV (Human Immunodeficiency Virus) researchers are often concerned with the correlation between HIV viral load measurements and CD4+ lymphocyte counts. Due to the lower limits of detection (LOD) of the available assays, HIV viral load measurements are subject to left-censoring. Motivated by these considerations, the maximum likelihood (ML) method under normality assumptions was recently proposed for estimating the correlation between two continuous variables that are subject to left-censoring. In this paper, we propose a generalized estimating equations (GEE) approach as an alternative for estimating such a correlation coefficient. We investigate the robustness to the normality assumption of the ML and the GEE approaches via simulations. An actual HIV data example is used for illustration.