Abstract: Relative entropy identities yield basic decompositions of categorical data log-likelihoods. These naturally lead to the development of information models, in contrast to hierarchical log-linear models. A recent study by the authors clarified the principal difference between the two model types in data likelihood analysis. The proposed scheme of log-likelihood decomposition introduces a prototype of linear information models, with which a basic scheme of model selection can be formulated. Empirical studies with high-way contingency tables illustrate the natural selection of information models in contrast to hierarchical log-linear models.
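As a concrete illustration of the kind of relative entropy identity involved (a standard likelihood-ratio computation, not the authors' specific decomposition; the table below is made up), a short Python sketch evaluates G^2 = 2n KL(p_hat || p_ind) for a two-way table against the independence model:

    import numpy as np

    # Hypothetical 2x3 contingency table of counts.
    counts = np.array([[30.0, 10.0, 20.0],
                       [15.0, 25.0, 40.0]])
    n = counts.sum()
    p_hat = counts / n                                       # empirical joint distribution
    p_ind = np.outer(p_hat.sum(axis=1), p_hat.sum(axis=0))   # independence fit

    # G^2 = 2n * KL(p_hat || p_ind): the log-likelihood ratio against independence.
    mask = p_hat > 0
    g2 = 2.0 * n * np.sum(p_hat[mask] * np.log(p_hat[mask] / p_ind[mask]))
    print(f"G^2 = {g2:.3f}")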
Abstract: Accelerated degradation tests (ADTs) can provide timely reliability information about a product. Hence, ADTs have been widely used to assess the lifetime distribution of highly reliable products. In order to predict the lifetime distribution properly, modeling the product's degradation path plays a key role in a degradation analysis. In this paper, we use a stochastic diffusion process to describe the product's degradation path, and a recursive formula for the product's lifetime distribution is obtained from the first passage time (FPT) of the degradation path. In addition, two approximate formulas for the product's mean-time-to-failure (MTTF) and median life (B50) are given. Finally, we extend the proposed method to the ADT setting, and a real LED dataset is used to illustrate the proposed procedure. The results demonstrate that the proposed method performs well for LED lifetime prediction.
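A rough Python sketch of the FPT idea, using a generic Wiener degradation path rather than the paper's specific diffusion model or recursive formula; the drift, diffusion, and failure-threshold values are assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, threshold = 0.05, 0.1, 1.0    # assumed drift, diffusion, failure level
    dt, t_max, n_paths = 0.1, 500.0, 10_000

    steps = int(t_max / dt)
    fpt = np.full(n_paths, np.nan)
    x = np.zeros(n_paths)
    for k in range(steps):
        x += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        newly = np.isnan(fpt) & (x >= threshold)
        fpt[newly] = (k + 1) * dt            # first time the path crosses the threshold

    hit = fpt[~np.isnan(fpt)]
    print("MTTF ~", hit.mean())              # mean time to failure
    print("B50  ~", np.median(hit))          # median life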
Abstract: Datasets are sometimes encountered that consist of a two-way table of 0's and 1's. For example, such a table might show which patients are impaired on which of a battery of tests, or which compounds are successful at inactivating which of several micro-organisms. The present paper describes a method of analysing such tables that reveals and specifies two (or more) systems or modes of action, if indeed they are needed to explain the data. The approach is an extension of what, in the context of cognitive impairments, is termed double dissociation. In order to be simple enough to be practicable, the approach is deterministic rather than probabilistic.
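In its simplest pairwise form, a double dissociation in such a table is a pair of rows and a pair of columns with complementary 0/1 patterns. A minimal Python sketch of that check, on a made-up table (the paper's method is more elaborate than this):

    import numpy as np
    from itertools import combinations

    # Hypothetical patients x tests table: 1 = impaired, 0 = unimpaired.
    X = np.array([[1, 0, 1],
                  [0, 1, 0],
                  [1, 1, 0]])

    # A double dissociation: rows i, j and columns a, b with
    # X[i,a]=1, X[i,b]=0 and X[j,a]=0, X[j,b]=1.
    for i, j in combinations(range(X.shape[0]), 2):
        for a, b in combinations(range(X.shape[1]), 2):
            if X[i, a] and not X[i, b] and not X[j, a] and X[j, b]:
                print(f"double dissociation: rows ({i},{j}), cols ({a},{b})")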
Abstract: Methods for testing the equality of two means are of critical importance in many areas of applied statistics. In the microarray context, it is often necessary to apply this kind of testing to small samples containing no more than a dozen elements, when inevitably the power of these tests is low. We suggest an augmentation of the classical t-test by introducing a new test statistic which we call "bio-weight." We show by simulation that in practically important cases of small sample size, the test based on this statistic is substantially more powerful than the classical t-test. The power computations are accompanied by ROC and FDR analysis of the simulated microarray data.
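The bio-weight statistic itself is defined in the paper; the surrounding Monte Carlo power computation, however, is standard. A minimal Python sketch estimating the power of the classical two-sample t-test at a microarray-scale sample size (the effect size, sample size, and level are illustrative; in the paper's comparison, the bio-weight statistic would take the place of ttest_ind):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, delta, alpha, reps = 6, 1.5, 0.05, 5_000   # small samples, assumed effect size

    rejections = 0
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, n)
        y = rng.normal(delta, 1.0, n)
        _, p = stats.ttest_ind(x, y)
        rejections += p < alpha

    print("estimated power of the classical t-test:", rejections / reps)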
Abstract: The creation of data sets using observational methods for the lag-sequential study of behavior requires selection of a recording time unit. This is an important issue because standard methods such as momentary sampling and partial-interval sampling consistently underestimate the frequency of some behaviors. This leads to inaccurate estimation of both unconditional and conditional probabilities of the different behaviors, the basic descriptive and analytic tools of sequential analysis methodology. The purpose of this paper is to investigate the creation of data sets usable for the purpose of sequential analysis. We show that such data vary depending on the time resolution and that inaccurate choices lead to biased estimates of transition probabilities.
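A minimal Python sketch of the time-resolution effect: a two-state chain simulated on a fine grid is recorded at progressively coarser time units (momentary sampling by subsampling), and the estimated transition probabilities drift away from the fine-grained ones. The transition matrix and grid are made up:

    import numpy as np

    rng = np.random.default_rng(2)

    def transition_matrix(states):
        """Empirical transition probabilities of a 2-state coded sequence."""
        counts = np.zeros((2, 2))
        for s, t in zip(states[:-1], states[1:]):
            counts[s, t] += 1
        return counts / counts.sum(axis=1, keepdims=True)

    # Simulate a two-state chain on a fine grid (dt = 0.1 s), then record it
    # with coarser time units by subsampling (momentary sampling).
    P_fine = np.array([[0.98, 0.02], [0.05, 0.95]])
    seq = [0]
    for _ in range(100_000):
        seq.append(rng.choice(2, p=P_fine[seq[-1]]))
    seq = np.array(seq)

    for step in (1, 10, 50):   # recording unit = step * 0.1 s
        print(step, transition_matrix(seq[::step]).round(3))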
Abstract: New rank-based test statistics are proposed for the problem of a possible change in the distribution of independent observations. We extend the two-sample test statistic of Damico (2004) to the change-point setup. The finite-sample critical values of the proposed tests are estimated. We also conduct a Monte Carlo simulation to compare the power of the new tests with that of their competitors. Using the Nile data of Cobb (1978), we demonstrate the applicability of the new tests.
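A common rank-based change-point construction, shown here only as an illustration and not necessarily the statistic extended from Damico (2004), maximizes a standardized Wilcoxon rank sum over candidate split points and calibrates it by permutation:

    import numpy as np
    from scipy.stats import rankdata

    def max_wilcoxon(x):
        """Max over split points k of the standardized rank sum of x[:k]."""
        n = len(x)
        r = rankdata(x)
        stats_ = []
        for k in range(2, n - 1):
            w = r[:k].sum()
            mean = k * (n + 1) / 2.0
            var = k * (n - k) * (n + 1) / 12.0
            stats_.append(abs(w - mean) / np.sqrt(var))
        return max(stats_)

    rng = np.random.default_rng(3)
    x = np.concatenate([rng.normal(0, 1, 50), rng.normal(1, 1, 50)])
    obs = max_wilcoxon(x)
    perm = [max_wilcoxon(rng.permutation(x)) for _ in range(999)]
    print("permutation p-value ~", (1 + sum(p >= obs for p in perm)) / 1000)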
Abstract: Stochastic modeling and analysis of international key comparisons (interlaboratory comparisons) pose several fundamental questions for statistical methodology. A key comparison (KC) is specifically designed to derive the key comparison reference value and to assess conformance of calibrations by participating national metrology laboratories at a few "key" settings for a particular measurement process. An approach to the statistical study of key comparison data is proposed using a model taken from meta-analysis. This model leads to a class of weighted-means estimators for the consensus value and to a method of assessing the uncertainty of the resulting estimates.
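The weighted-means idea can be illustrated with the standard random-effects estimator of DerSimonian and Laird from meta-analysis; the laboratory values and uncertainties below are made up, and the paper's class of estimators is broader than this single case:

    import numpy as np

    # Hypothetical lab results x_i with reported standard uncertainties u_i.
    x = np.array([9.98, 10.02, 10.05, 9.95, 10.01])
    u = np.array([0.02, 0.03, 0.01, 0.04, 0.02])

    w = 1.0 / u**2
    xbar = np.sum(w * x) / np.sum(w)                 # fixed-effects weighted mean
    q = np.sum(w * (x - xbar) ** 2)                  # Cochran's Q heterogeneity
    k = len(x)
    tau2 = max(0.0, (q - (k - 1)) / (w.sum() - (w**2).sum() / w.sum()))

    w_re = 1.0 / (u**2 + tau2)                       # random-effects weights
    consensus = np.sum(w_re * x) / np.sum(w_re)
    u_consensus = 1.0 / np.sqrt(np.sum(w_re))
    print(f"consensus = {consensus:.4f} +/- {u_consensus:.4f}")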
Abstract: Overdispersion, or extravariation as it is often called, is believed to be more prevalent in survey data due to heterogeneity among and between the units. One approach to addressing this phenomenon is to use a generalized Dirichlet-multinomial model. In its application, the generalized Dirichlet-multinomial model assumes that the clusters are of equal size and that the number of clusters remains the same from time to time. In practice this is rarely the case when clusters are observed over time. In this paper, the random variability and the varying response rates are accounted for in the model. This requires modeling another level of variation. In effect, the result can be considered a hierarchical model that allows varying response rates in the presence of overdispersed multinomial data. The model and its applicability are demonstrated through an illustrative application to a subset of the well-known High School and Beyond survey data.
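A minimal sketch of the overdispersion mechanism itself: Dirichlet-multinomial counts with cluster sizes allowed to vary. The parameter values are illustrative, and this is only the base model, not the paper's full hierarchical extension:

    import numpy as np

    rng = np.random.default_rng(4)
    alpha = np.array([2.0, 3.0, 5.0])     # assumed Dirichlet parameters
    sizes = rng.poisson(30, size=8) + 1   # varying cluster sizes

    # Each cluster draws its own category probabilities, inducing extra
    # variation relative to a single multinomial with fixed probabilities.
    counts = np.array([rng.multinomial(m, rng.dirichlet(alpha)) for m in sizes])
    print(counts)
    print("observed var of first-category proportion:",
          (counts[:, 0] / sizes).var())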
Abstract: High-density oligonucleotide arrays have become a standard research tool for monitoring the expression of thousands of genes simultaneously. Affymetrix GeneChip arrays are the most popular; they use short oligonucleotides to probe for genes in an RNA sample. However, important challenges remain in estimating expression levels from raw hybridization intensities on the array. In this paper, we deal with the problem of estimating gene expression based on a statistical model. The present method is similar to the Li and Wong (2001a) model but is more general. More precisely, we show how the model introduced by Li and Wong can be generalized to provide a new measure of gene expression. Moreover, we provide a comparison between the two models.
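For orientation, the Li and Wong (2001a) model writes the probe-level intensity for probe j on array i as y_ij = theta_i * phi_j + noise, with theta_i the expression level and phi_j the probe affinity. A minimal alternating least-squares sketch on synthetic data (the generalized estimator of this paper is not reproduced here):

    import numpy as np

    rng = np.random.default_rng(5)
    theta_true = rng.uniform(1, 5, size=10)       # array-specific expression levels
    phi_true = rng.uniform(0.5, 2, size=20)       # probe affinities
    y = np.outer(theta_true, phi_true) + rng.normal(0, 0.3, (10, 20))

    # Alternating least squares for y_ij = theta_i * phi_j + noise.
    phi = np.ones(20)
    for _ in range(50):
        theta = y @ phi / (phi @ phi)
        phi = theta @ y / (theta @ theta)
    phi *= np.sqrt(len(phi) / (phi @ phi))        # identifiability: sum phi_j^2 = J
    theta = y @ phi / (phi @ phi)
    print("estimated expression levels:", theta.round(2))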
Abstract: It is shown that the most popular posterior distribution for the mean of the normal distribution is obtained by deriving the distribution of the ratio X/Y, where X and Y are, respectively, normal and Student's t random variables distributed independently of each other. Tabulations of the associated percentage points are given along with a computer program for generating them.
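The percentage points of such a ratio can also be approximated by simulation; a minimal Python sketch with illustrative choices (standard normal X, Student's t with 5 degrees of freedom for Y):

    import numpy as np

    rng = np.random.default_rng(6)
    df = 5                                   # illustrative degrees of freedom
    x = rng.standard_normal(1_000_000)       # X ~ N(0, 1)
    y = rng.standard_t(df, 1_000_000)        # Y ~ t_df, independent of X
    ratio = x / y

    for p in (0.90, 0.95, 0.99):
        print(f"{p:.2f} percentage point ~ {np.quantile(ratio, p):.3f}")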