On Estimation of Rayleigh Scale Parameter under Doubly Type-II Censoring from Imprecise Data

The scheme of doubly type-II censored sampling is an important method of obtaining data in lifetime studies. Statistical analysis of lifetime distributions under this censoring scheme is usually based on precise lifetime data. However, some collected lifetime data might be imprecise and are represented in the form of fuzzy numbers. This paper deals with the problem of estimating the scale parameter of the Rayleigh distribution under the doubly type-II censoring scheme when the lifetime observations are fuzzy and are assumed to be related to the underlying crisp realization of a random sample. We propose a new method to determine the maximum likelihood (ML) estimate of the parameter of interest. The asymptotic variance of the ML estimate is then derived by using the missing information principle. The performance of the estimate and of its variance is then assessed through Monte Carlo simulations. Finally, an illustrative example with real data concerning 25 ball bearings in a life test is presented.


Introduction
In many life testing experiments, the experimenter may not observe the lifetimes of all inspected units in the life test. This may be because of time limitations and/or other restrictions (such as money and material resources) on data collection. Censored data arise in these situations, wherein the experimenter does not obtain complete information for all the units under study. Different types of censoring arise based on how the data are collected from the life-testing experiment. The scheme of doubly type-II censored sampling is an important method of obtaining data in life testing experiments. It can be described as follows. Consider a life testing experiment in which n identical units are placed on test simultaneously. The first r lifetimes may be left-censored due to negligence or problems at the beginning of the experiment, and the experiment terminates as soon as the mth unit fails. Then the data constitute a type-II doubly censored sample. Some of the earlier work on doubly censored samples was conducted by Harter and Moore (1968) and Lalitha and Mishra (1996). Fernandez (2000) discussed Bayesian inference from type-II doubly censored Rayleigh data. Lin and Balakrishnan (2003) obtained exact prediction intervals for failure times from one-parameter and two-parameter exponential distributions based on doubly type-II censored samples. Wu (2008) discussed interval estimation for the Pareto distribution based on a doubly type-II censored sample. Kim and Song (2010) considered the problem of estimating the parameters and the reliability function of the generalized exponential distribution, based on a type-II doubly censored sample, from a Bayesian viewpoint.
The above inference techniques for estimating parameters of different lifetime distributions under doubly type-II censored data are based on precise lifetime data. However, in real situations, the lifetimes of units sometimes cannot be recorded or measured precisely due to machine errors, human errors or some unexpected situations. For instance, the lifetime observations may be reported as imprecise quantities such as "about 1000h", "approximately 1400h", "almost between 1000h and 1200h", "essentially less than 1200h", and so on. The lack of precision of lifetime data may be described using fuzzy sets. The classical statistical estimation methods are not appropriate to deal with such imprecise cases. Therefore, we need a suitable statistical methodology to handle these data as well.
In this paper, our objective is to devise a method for parameter estimation in a life test from which the doubly type-II censored data are reported in the form of fuzzy numbers. We analyze the data under the assumption that the lifetimes of the test units are independent and identically distributed Rayleigh random variables. In Section 2, we first present in greater detail the problem addressed in this paper. Some preliminary concepts about fuzzy numbers are included in this section. In Section 3, we introduce a generalization of the likelihood function under doubly type-II censoring and obtain the maximum likelihood estimate of the parameter of interest. Then, by using the missing information principle, in Section 4 we compute the asymptotic variance of the ML estimate. In Section 5, a simulation study is carried out to assess the performance of the proposed methods, and a real data set from the life testing experiment provided by Caroni (2002) is studied.

Problem Description
The Rayleigh distribution is a special case of the Weibull distribution. It provides a population model useful in several areas of statistics, including life testing and the reliability of components that age with time, since its failure rate is a linear function of time. Bhattacharya and Tyagi (1990) mentioned that in some clinical studies dealing with cancer patients, the survival pattern follows the Rayleigh distribution. Dyer and Whisenand (1973) demonstrated the importance of this distribution in communication engineering. A number of authors have considered the problem of estimating the scale parameter of the Rayleigh distribution using different types of censored and uncensored data. Among others, Raqab and Madi (2002) considered the estimation of the predictive distribution of the total time on test up to a certain failure in a future sample on the basis of a doubly censored random sample of failure times drawn from a Rayleigh distribution. Kim and Han (2009) discussed estimation of the scale parameter of the Rayleigh distribution under general progressive censoring. Lee et al. (2011) obtained a Bayes estimator under the Rayleigh distribution with a progressive type-II right censored sample.
The density, reliability and hazard (failure rate) functions of the Rayleigh($\sigma$) distribution are given, respectively, by
$$f(x;\sigma) = \frac{x}{\sigma^2}\exp\left(-\frac{x^2}{2\sigma^2}\right), \quad x > 0,\ \sigma > 0, \tag{1}$$
$$S(x;\sigma) = \exp\left(-\frac{x^2}{2\sigma^2}\right), \tag{2}$$
$$h(x;\sigma) = \frac{x}{\sigma^2}, \tag{3}$$
and the cumulative distribution function (cdf) is
$$F(x;\sigma) = 1 - \exp\left(-\frac{x^2}{2\sigma^2}\right).$$
It is clear from (3) that the Rayleigh distribution has a linearly increasing failure rate, which makes it a suitable model for the lifetime of components that age rapidly with time. Consider a reliability experiment in which $n$ identical units are placed on a life test. Let $X_1, \ldots, X_n$ denote the lifetimes of these experimental units. Assume that these variables are independent and identically distributed as Rayleigh($\sigma$). Prior to the experiment, the number $m$ is specified such that $0 \le r < m \le n$. Let $x = (x_{r+1}, \ldots, x_m)$ denote a doubly type-II censored sample from the population given in (1). The likelihood function for the parameter $\sigma$ becomes proportional to
$$L(\sigma; x) \propto \sigma^{-2(m-r)} \left[1 - \exp\left(-\frac{x_{r+1}^2}{2\sigma^2}\right)\right]^{r} \exp\left(-\frac{T}{2\sigma^2}\right),$$
where
$$T = \sum_{i=r+1}^{m} x_i^2 + (n-m)\,x_m^2.$$
The maximum likelihood estimator (MLE) of $\sigma$ can be determined by numerical methods from the equation
$$2(m-r)\,\sigma^2 = T - \frac{r\,x_{r+1}^2}{\exp\left(x_{r+1}^2/(2\sigma^2)\right) - 1}.$$
Precisely reported lifetimes are common when data come from specially designed life tests. In such a case a failure should be precisely defined, and all tested items should be continuously monitored. However, in real situations these test requirements might not be fulfilled. In these cases, it is sometimes impossible to obtain exact observations of lifetime, and the lifetime data obtained are often imprecise. The lack of precision of lifetime data may be described using fuzzy sets. In the following, we recall the main definitions of fuzzy sets and some of the formulas used in this paper.
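For the crisp (non-fuzzy) case, the likelihood equation for a doubly type-II censored Rayleigh sample can be solved by simple bisection. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, lifetimes are assumed strictly positive, and the bracket for the root is an assumption derived from the inequality $e^u - 1 \ge u$.

```python
import math

def rayleigh_pdf(x, sigma):
    """Density f(x; sigma) = (x / sigma^2) exp(-x^2 / (2 sigma^2))."""
    return (x / sigma**2) * math.exp(-x * x / (2.0 * sigma**2))

def rayleigh_sf(x, sigma):
    """Reliability function S(x; sigma) = exp(-x^2 / (2 sigma^2))."""
    return math.exp(-x * x / (2.0 * sigma**2))

def rayleigh_hazard(x, sigma):
    """Hazard rate h(x; sigma) = x / sigma^2 (linear in x)."""
    return x / sigma**2

def censored_score(sigma, obs, r, n):
    """sigma^3 times the derivative of the censored log-likelihood.

    obs holds the ordered observed lifetimes x_{r+1} <= ... <= x_m;
    r units are left-censored and n - m units are right-censored.
    """
    m = r + len(obs)
    total = sum(x * x for x in obs) + (n - m) * obs[-1] ** 2
    value = total - 2.0 * len(obs) * sigma**2
    if r > 0:
        a = obs[0] ** 2
        u = a / (2.0 * sigma**2)
        # a / (e^u - 1) is negligible for large u; skipping it avoids overflow
        if u < 700.0:
            value -= r * a / math.expm1(u)
    return value

def censored_mle(obs, r, n, tol=1e-12):
    """Bisection on the score; the bracket endpoints are an assumption."""
    obs = sorted(obs)
    m = r + len(obs)
    total = sum(x * x for x in obs) + (n - m) * obs[-1] ** 2
    lo = math.sqrt(total / (8.0 * m))    # score is positive here
    hi = math.sqrt(total / len(obs))     # score is negative here
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if censored_score(mid, obs, r, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With r = 0 and m = n the equation reduces to the complete-sample estimate $(\sum x_i^2 / (2n))^{1/2}$, which gives a quick sanity check of the routine.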
Consider an experiment characterized by a probability space $\mathcal{X} = (X, \mathcal{B}_X, P_\theta)$, where $(X, \mathcal{B}_X)$ is a measurable space and $P_\theta$ belongs to a specified family of probability measures $\{P_\theta, \theta \in \Theta\}$ on $(X, \mathcal{B}_X)$. A fuzzy set $\tilde{A}$ in $X$ is characterized by a membership function $\mu_{\tilde{A}}(x)$ which associates with each point $x$ in $X$ a real number in the interval $[0, 1]$, with the value of $\mu_{\tilde{A}}(x)$ at $x$ representing the "grade of membership" of $x$ in $\tilde{A}$. The notion of probability was extended to fuzzy events by Zadeh (1968) as follows.
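Definition 1. Let $(\mathbb{R}^n, \mathcal{A}, P)$ be a probability space in which $\mathcal{A}$ is the $\sigma$-field of Borel sets in $\mathbb{R}^n$ and $P$ is a probability measure over $\mathbb{R}^n$. Then a fuzzy event in $\mathbb{R}^n$ is a fuzzy subset $\tilde{A}$ of $\mathbb{R}^n$ whose membership function $\mu_{\tilde{A}}$ is Borel measurable. The probability of a fuzzy event $\tilde{A}$ is defined by
$$P(\tilde{A}) = \int_{\mathbb{R}^n} \mu_{\tilde{A}}(x)\, dP.$$
In particular, assume that $P$ is the probability distribution of a continuous random variable $Y$ with p.d.f. $g(y)$. The conditional density of $Y$ given $\tilde{A}$ is given by
$$g(y \mid \tilde{A}) = \frac{\mu_{\tilde{A}}(y)\, g(y)}{\int \mu_{\tilde{A}}(u)\, g(u)\, du}. \tag{8}$$
The set consisting of all observable events from the experiment $\mathcal{X}$ determines a fuzzy information system associated with it, which is defined as follows.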
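Definition 2. (Tanaka et al., 1979). A fuzzy information system (f.i.s.) $\tilde{\mathcal{X}}$ associated with the experiment $\mathcal{X}$ is a fuzzy partition $\{\tilde{x}_1, \ldots, \tilde{x}_K\}$ of $X$, that is, a set of $K$ fuzzy events on $X$ satisfying the orthogonality condition
$$\sum_{k=1}^{K} \mu_{\tilde{x}_k}(x) = 1 \quad \text{for all } x \in X,$$
where $\mu_{\tilde{x}_k}$ denotes the membership function of $\tilde{x}_k$.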
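The orthogonality condition in Definition 2 can be checked numerically. The sketch below builds a hypothetical fuzzy partition from piecewise-linear membership functions with flat shoulders at both ends; the knot values and the name `make_partition` are illustrative assumptions and do not come from the paper.

```python
def make_partition(knots):
    """Return len(knots) membership functions whose values sum to 1 everywhere.

    Each membership function peaks at one knot and falls linearly to 0 at the
    neighbouring knots; the first and last ones stay at 1 beyond the ends.
    """
    last = len(knots) - 1

    def member(j):
        def mu(x):
            if x <= knots[0]:
                return 1.0 if j == 0 else 0.0
            if x >= knots[last]:
                return 1.0 if j == last else 0.0
            for i in range(last):
                if knots[i] <= x <= knots[i + 1]:
                    t = (x - knots[i]) / (knots[i + 1] - knots[i])
                    if j == i:
                        return 1.0 - t
                    if j == i + 1:
                        return t
                    return 0.0
            return 0.0
        return mu

    return [member(j) for j in range(len(knots))]

# Hypothetical partition with four fuzzy events (knots in milligrams).
partition = make_partition([10.0, 20.0, 30.0, 40.0])
grades_at_17 = [mu(17.0) for mu in partition]  # membership grades of x = 17
```

At any point at most two of the membership grades are nonzero, and they are complementary, which is exactly what the orthogonality condition requires.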
We now examine a brief example illustrating the preceding concept.
Example 1. An investigator is interested in analyzing the amount of an adverse substance extracted from a special brand of cigarettes (experiment $\mathcal{X}$). Assume that the investigator does not have a measurement mechanism precise enough to determine the amount of the adverse substance exactly, but can only approximate it by means of the following fuzzy observations: $\tilde{x}_1$ = "approximately lower than 10 milligrams", $\tilde{x}_2$ = "approximately 15 to 20 milligrams", $\tilde{x}_3$ = "approximately 25 milligrams", $\tilde{x}_4$ = "approximately 30 milligrams", $\tilde{x}_5$ = "approximately 35 to 40 milligrams", $\tilde{x}_6$ = "approximately higher than 45 milligrams", which are characterized by the membership functions in Figure 1 (clearly, these fuzzy observations form a f.i.s.).
In order to model imprecise lifetimes, a generalization of real numbers is necessary. These lifetimes can be represented by fuzzy numbers. A fuzzy number is a subset, denoted by $\tilde{x}$, of the set of real numbers (denoted by $\mathbb{R}$) and is characterized by the so-called membership function $\mu_{\tilde{x}}(\cdot)$. Fuzzy numbers satisfy the following constraints (Dubois and Prade, 1980):
(1) $\mu_{\tilde{x}}: \mathbb{R} \to [0, 1]$ is Borel measurable;
(2) there exists $x_0 \in \mathbb{R}$ such that $\mu_{\tilde{x}}(x_0) = 1$;
(3) the so-called $\lambda$-cuts $(0 < \lambda \le 1)$, defined as $B_\lambda(\tilde{x}) = \{x \in \mathbb{R} : \mu_{\tilde{x}}(x) \ge \lambda\}$, are all closed intervals.
With the definition of a fuzzy number given above, an exact (non-fuzzy) number can be treated as a special case of a fuzzy number. For a non-fuzzy real observation $x_0 \in \mathbb{R}$, its corresponding membership function is $\mu_{x_0}(x_0) = 1$.
Usually, LR-type fuzzy numbers (of which the triangular and trapezoidal fuzzy numbers are special cases) are the most convenient and useful for describing fuzzy lifetime observations. Therefore, we shall focus on the set of LR-type fuzzy numbers.
Definition 3. (Zimmermann, 1991, p. 62). Let $L$ (and $R$) be decreasing shape functions from $\mathbb{R}^+$ to $[0, 1]$ with $L(0) = 1$; $L(x) < 1$ for all $x > 0$; $L(x) > 0$ for all $x < 1$; $L(1) = 0$ or ($L(x) > 0$ for all $x$ and $L(+\infty) = 0$). Then a fuzzy number $\tilde{x}$ is called of LR-type if, for $c$, $a > 0$, $b > 0$ in $\mathbb{R}$,
$$\mu_{\tilde{x}}(x) = \begin{cases} L\left(\dfrac{c - x}{a}\right), & x \le c, \\[2mm] R\left(\dfrac{x - c}{b}\right), & x \ge c, \end{cases}$$
with mean value $c$ and left and right spreads $a$ and $b$.
Example 2. Assume that $n$ identical batteries are placed on a test, and we are interested in the lifetime of these batteries. The unknown lifetime $x_i$ of battery $i$ may be regarded as a realization of a random variable $X_i$ induced by random sampling from a total population of batteries. A tested battery may be considered as failed, or, strictly speaking, as nonconforming, when at least one value of its parameters falls beyond specification limits. In practice, however, the observer does not have the possibility to measure all parameters and is not able to define precisely the moment of a failure. So, he/she determines two intervals for the lifetime of each battery $i$: an interval $[a_i, d_i]$ that certainly contains $x_i$, and an interval $[b_i, c_i]$ that contains highly plausible values of $x_i$. This information may be encoded as a trapezoidal fuzzy number $\tilde{x}_i = (a_i, b_i, c_i, d_i)$ with support $[a_i, d_i]$ and core $[b_i, c_i]$, interpreted as a possibility distribution constraining the unknown value $x_i$.
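To make the trapezoidal encoding of Example 2 concrete, the following sketch builds a trapezoidal membership function and evaluates Zadeh's probability of the corresponding fuzzy event, together with the conditional density, under a Rayleigh model. It is an illustration under stated assumptions: the helper names, the composite Simpson rule and the numerical values are all hypothetical choices, not part of the paper.

```python
import math

def rayleigh_pdf(x, sigma):
    return (x / sigma**2) * math.exp(-x * x / (2.0 * sigma**2))

def trapezoid(a, b, c, d):
    """Membership function with support [a, d] and core [b, c], a <= b <= c <= d."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)
    return mu

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def fuzzy_probability(mu, sigma, lo, hi):
    """P(A~) = integral of mu(x) f(x; sigma) over the support [lo, hi]."""
    return simpson(lambda x: mu(x) * rayleigh_pdf(x, sigma), lo, hi)

def conditional_density(y, mu, sigma, lo, hi):
    """Density of Y given the fuzzy event: mu(y) f(y) / P(A~)."""
    return mu(y) * rayleigh_pdf(y, sigma) / fuzzy_probability(mu, sigma, lo, hi)

# Hypothetical battery lifetime "certainly in [0.5, 3], most plausibly in [1, 2]".
mu_battery = trapezoid(0.5, 1.0, 2.0, 3.0)
p_battery = fuzzy_probability(mu_battery, 1.0, 0.5, 3.0)
```

When the trapezoid degenerates to a crisp interval (a = b and c = d), the fuzzy probability reduces to the ordinary probability $F(c) - F(b)$ of that interval.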
In the next section, we introduce a generalization of the likelihood function based on doubly type-II censoring when the lifetime observations are reported in the form of LR-type fuzzy numbers. The maximum likelihood estimate of the scale parameter of the Rayleigh model is then obtained using the EM algorithm.

Maximum Likelihood Estimation
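Suppose that $n$ independent units are put on a test and that the lifetime distribution of each unit is given by (1). Now consider the problem where, under the doubly type-II censoring scheme, failure times are not observed precisely and only partial information about them is available in the form of fuzzy numbers $\tilde{x}_i = (a_i, c_i, b_i)$, $i = r+1, \ldots, m$, with membership functions $\mu_{\tilde{x}_i}(\cdot)$. Let $c_{(r+1)} \le \cdots \le c_{(m)}$ denote the ordered values of the means of these fuzzy numbers. The lifetimes of the first $r$ missing units can be modeled by fuzzy numbers $\tilde{y}_1, \ldots, \tilde{y}_r$ with membership functions $\mu_{\tilde{y}_k}(\cdot)$, $k = 1, \ldots, r$. Also, the lifetimes of the $n - m$ surviving units, which are censored from the test after the $m$th failure, can be encoded as fuzzy numbers $\tilde{z}_{m+1}, \ldots, \tilde{z}_n$ with membership functions $\mu_{\tilde{z}_j}(\cdot)$, $j = m+1, \ldots, n$. The fuzzy data $\tilde{w} = (\tilde{y}_1, \ldots, \tilde{y}_r, \tilde{x}_{r+1}, \ldots, \tilde{x}_m, \tilde{z}_{m+1}, \ldots, \tilde{z}_n)$ is thus the set of observed lifetimes. The corresponding observed-data log-likelihood function can be obtained using the definition of the probability of a fuzzy event as
$$L_O(\tilde{w}; \sigma) = \sum_{k=1}^{r} \log \int f(y;\sigma)\,\mu_{\tilde{y}_k}(y)\,dy + \sum_{i=r+1}^{m} \log \int f(x;\sigma)\,\mu_{\tilde{x}_i}(x)\,dx + \sum_{j=m+1}^{n} \log \int f(z;\sigma)\,\mu_{\tilde{z}_j}(z)\,dz,$$
where $f(\cdot;\sigma)$ is the Rayleigh density (1), and subsequently the associated gradient is found to be
$$\frac{\partial L_O(\tilde{w}; \sigma)}{\partial \sigma} = -\frac{2n}{\sigma} + \frac{1}{\sigma^3}\left[\sum_{k=1}^{r} E_\sigma(Y_k^2 \mid \tilde{y}_k) + \sum_{i=r+1}^{m} E_\sigma(X_i^2 \mid \tilde{x}_i) + \sum_{j=m+1}^{n} E_\sigma(Z_j^2 \mid \tilde{z}_j)\right],$$
where, for example, $E_\sigma(Y_k^2 \mid \tilde{y}_k) = \int y^2 f(y;\sigma)\,\mu_{\tilde{y}_k}(y)\,dy \big/ \int f(y;\sigma)\,\mu_{\tilde{y}_k}(y)\,dy$, and similarly for the other terms. To achieve estimation via the ML method, it is not easy to solve the equation $\partial L_O(\tilde{w}; \sigma)/\partial\sigma = 0$ directly. In the following, Theorem 1 discusses the existence and uniqueness of the MLE of $\sigma$.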
Theorem 1. Under doubly type-II censoring, the MLE of the scale parameter $\sigma$ of a Rayleigh population exists and is unique.
Proof. Let $\theta = 1/\sigma$. Due to the invariance property of MLEs, we show the existence and uniqueness of the MLE of $\theta$ instead of $\sigma$. The log-likelihood function $L = L_O(\tilde{w}; \theta)$ based on a doubly type-II censored sample can be written as
Differentiating (11) with respect to $\theta$ yields
Let $g(\theta)$ denote the function on the right-hand side of the expression in (12). Then it is easily seen that
Therefore, the equation $g(\theta) = 0$ has at least one root. To prove that the root is unique, we consider the first derivative of $g$, $g'(\theta)$, given by
Then $g'(\theta)$ can be written as
It is clear that $s(\theta)$ is a log-concave function of $\theta$, and this implies that $w_1(\theta)$ and $w_i(\theta)$ are also log-concave in $\theta$ (see the Prékopa-Leindler inequality in the Appendix).
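It follows that $g$ is a strictly decreasing function of $\theta$, and hence the equation $g(\theta) = 0$ has exactly one solution. $\square$
The maximum likelihood estimate of $\sigma$ must be derived numerically. The Expectation Maximization (EM) algorithm has emerged as a very important tool for estimating the parameters of a model, especially when the available data are incomplete. One advantage of the EM algorithm is that asymptotic variances and covariances of the EM estimates can be computed, as discussed in Section 4. Since the observed fuzzy data $\tilde{w}$ can be seen as an incomplete specification of a complete data vector $w$, the EM algorithm is applicable to obtain the maximum likelihood estimate of the parameter. First of all, denote the lifetimes of the missing, failed and censored units by $Y = (Y_1, \ldots, Y_r)$, $X = (X_{r+1}, \ldots, X_m)$ and $Z = (Z_{m+1}, \ldots, Z_n)$, respectively. The combination $(Y, X, Z) = W$ forms the complete lifetimes, and the corresponding log-likelihood function is denoted by $L(W; \sigma)$. In the following, we use the fuzzy EM (FEM) algorithm (Denoeux, 2011) to determine the MLE of $\sigma$. Up to an additive term that does not depend on $\sigma$, the log-likelihood function based on the complete lifetimes $W$ becomes
$$L(W; \sigma) = -2n \log \sigma - \frac{1}{2\sigma^2}\left[\sum_{k=1}^{r} y_k^2 + \sum_{i=r+1}^{m} x_i^2 + \sum_{j=m+1}^{n} z_j^2\right].$$
To perform the E-step of the EM algorithm, we need to compute the conditional expectation of the complete-data log-likelihood given the observed data $\tilde{w}$, using the current fit $\sigma^{(h)}$ of the parameter $\sigma$, namely $E_{\sigma^{(h)}}(L(W; \sigma) \mid \tilde{w})$. By using the expression (8), the conditional expectations $E_{\sigma^{(h)}}(Y_k^2 \mid \tilde{y}_k)$, $E_{\sigma^{(h)}}(X_i^2 \mid \tilde{x}_i)$ and $E_{\sigma^{(h)}}(Z_j^2 \mid \tilde{z}_j)$ can be obtained as follows:
$$E_{\sigma^{(h)}}(Y_k^2 \mid \tilde{y}_k) = \frac{\int y^2 f(y; \sigma^{(h)})\,\mu_{\tilde{y}_k}(y)\,dy}{\int f(y; \sigma^{(h)})\,\mu_{\tilde{y}_k}(y)\,dy},$$
with the analogous expressions for $X_i$ given $\tilde{x}_i$ and $Z_j$ given $\tilde{z}_j$. The expected complete-data log-likelihood can thus be written as
$$E_{\sigma^{(h)}}(L(W; \sigma) \mid \tilde{w}) = -2n \log \sigma - \frac{1}{2\sigma^2}\left[\sum_{k=1}^{r} E_{\sigma^{(h)}}(Y_k^2 \mid \tilde{y}_k) + \sum_{i=r+1}^{m} E_{\sigma^{(h)}}(X_i^2 \mid \tilde{x}_i) + \sum_{j=m+1}^{n} E_{\sigma^{(h)}}(Z_j^2 \mid \tilde{z}_j)\right]. \tag{16}$$
The M-step then consists in finding $\sigma^{(h+1)}$ which maximizes $E_{\sigma^{(h)}}(L(W; \sigma) \mid \tilde{w})$. This is easily achieved by solving the likelihood equation. From $\partial E_{\sigma^{(h)}}(L(W; \sigma) \mid \tilde{w})/\partial\sigma = 0$, we obtain
$$\sigma^{(h+1)} = \left\{\frac{1}{2n}\left[\sum_{k=1}^{r} E_{\sigma^{(h)}}(Y_k^2 \mid \tilde{y}_k) + \sum_{i=r+1}^{m} E_{\sigma^{(h)}}(X_i^2 \mid \tilde{x}_i) + \sum_{j=m+1}^{n} E_{\sigma^{(h)}}(Z_j^2 \mid \tilde{z}_j)\right]\right\}^{1/2}. \tag{17}$$
The MLE of $\sigma$ can be obtained by repeating the E- and M-steps until the difference $L_O(\tilde{w}; \sigma^{(h+1)}) - L_O(\tilde{w}; \sigma^{(h)})$ becomes smaller than some arbitrarily small amount.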
It is shown in Denoeux (2011) that the observed-data log-likelihood $L_O(\tilde{w}; \sigma)$ is not decreased after an EM iteration. Hence, convergence to some value $L^*$ is ensured as long as the sequence $L_O(\tilde{w}; \sigma^{(h)})$, $h = 0, 1, \ldots$, is bounded from above.
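The E- and M-steps described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: every unit is represented by a hypothetical tuple (membership function, lower support bound, upper support bound), the integrals are approximated by a composite Simpson rule, and the example data are symmetric triangular fuzzy numbers. Censored units could be passed in the same form with an indicator membership function on a bounded interval, which is a further assumption.

```python
import math

def rayleigh_pdf(w, sigma):
    return (w / sigma**2) * math.exp(-w * w / (2.0 * sigma**2))

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def triangular(c, a, b):
    """Triangular fuzzy number with mean c, left spread a and right spread b."""
    def mu(w):
        if c - a < w <= c:
            return (w - (c - a)) / a
        if c < w < c + b:
            return ((c + b) - w) / b
        return 0.0
    return mu, c - a, c + b

def e_step(unit, sigma):
    """Return (P(w~), E[W^2 | w~]) for one fuzzy observation at the current sigma."""
    mu, lo, hi = unit
    p = simpson(lambda w: rayleigh_pdf(w, sigma) * mu(w), lo, hi)
    s = simpson(lambda w: w * w * rayleigh_pdf(w, sigma) * mu(w), lo, hi)
    return p, s / p

def fuzzy_em(data, sigma0, tol=1e-10, max_iter=500):
    """Iterate E- and M-steps until the observed log-likelihood stalls."""
    sigma, previous, history = sigma0, -math.inf, []
    for _ in range(max_iter):
        stats = [e_step(unit, sigma) for unit in data]
        loglik = sum(math.log(p) for p, _ in stats)   # observed-data log-likelihood
        history.append(loglik)
        if loglik - previous < tol:
            break
        previous = loglik
        # M-step: sigma^2 = (sum of conditional second moments) / (2n)
        sigma = math.sqrt(sum(m2 for _, m2 in stats) / (2.0 * len(data)))
    return sigma, history

# Hypothetical fuzzy sample: five lifetimes known only up to +/- 0.5.
sample = [triangular(c, 0.5, 0.5) for c in (2.0, 3.5, 5.0, 1.2, 4.1)]
sigma_hat, loglik_path = fuzzy_em(sample, sigma0=1.0)
```

The stored log-likelihood path makes the monotonicity property easy to inspect: the values should never decrease from one iteration to the next. As the spreads shrink to zero, the estimate approaches the complete-sample value $(\sum c_i^2/(2n))^{1/2}$.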

Asymptotic Variance of the MLE
Louis (1982) developed a procedure for extracting the observed information matrix when the EM algorithm is used to find maximum likelihood estimates in an incomplete data problem. The idea of the procedure can be expressed by the missing information principle (Louis, 1982) as follows:
Observed information = Complete information − Missing information.
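We can use this procedure to compute the variance of the maximum likelihood estimate generated through the EM algorithm. Let $I_{\tilde{w}}(\sigma)$, $I_W(\sigma)$ and $I_{W \mid \tilde{w}}(\sigma)$ denote the observed information, the complete information and the missing information, respectively. From the classical results on the Rayleigh distribution, the complete-data information, $I_W(\sigma)$, is given by
$$I_W(\sigma) = \frac{4n}{\sigma^2}.$$
The expected information for the conditional distribution of $W$ given $\tilde{w}$ (missing information) is
$$I_{W \mid \tilde{w}}(\sigma) = \sum_{k=1}^{r} I_{Y_k \mid \tilde{y}_k}(\sigma) + \sum_{i=r+1}^{m} I_{X_i \mid \tilde{x}_i}(\sigma) + \sum_{j=m+1}^{n} I_{Z_j \mid \tilde{z}_j}(\sigma),$$
in which each term is the expected negative second derivative of the corresponding conditional log-density, for example $I_{Y_k \mid \tilde{y}_k}(\sigma) = -E_\sigma\left[\partial^2 \log f(Y_k \mid \tilde{y}_k; \sigma)/\partial\sigma^2 \mid \tilde{y}_k\right]$. Using (8), the logarithm of the conditional distribution of $Y_k$ given $\tilde{y}_k$ becomes
$$\log f(y_k \mid \tilde{y}_k; \sigma) = \log y_k - 2\log\sigma - \frac{y_k^2}{2\sigma^2} + \log \mu_{\tilde{y}_k}(y_k) - \log \int \frac{y_k}{\sigma^2}\, e^{-y_k^2/(2\sigma^2)}\, \mu_{\tilde{y}_k}(y_k)\, dy_k. \tag{23}$$
Differentiating (23) with respect to $\sigma$ yields
$$\frac{\partial}{\partial\sigma} \log f(y_k \mid \tilde{y}_k; \sigma) = \frac{y_k^2 - E_\sigma(Y_k^2 \mid \tilde{y}_k)}{\sigma^3},$$
where
$$E_\sigma(Y_k^2 \mid \tilde{y}_k) = \frac{\int y_k^2\, \frac{y_k}{\sigma^2}\, e^{-y_k^2/(2\sigma^2)}\, \mu_{\tilde{y}_k}(y_k)\, dy_k}{\int \frac{y_k}{\sigma^2}\, e^{-y_k^2/(2\sigma^2)}\, \mu_{\tilde{y}_k}(y_k)\, dy_k}.$$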
From these fuzzy data, and using the starting value $\sigma^{(0)} = \left(\sum_{i=1}^{30} x_i^2/30\right)^{1/2} = 48.0513$, which is the estimate of the parameter computed over the centers of the fuzzy numbers, the final MLE of $\sigma$ is found from (17) to be $\hat{\sigma} = 55.1901$. The complete information, the missing information and the observed information are $I_W(\hat{\sigma}) = 0.0328$, $I_{W \mid \tilde{w}}(\hat{\sigma}) = 0.0066$ and $I_{\tilde{w}}(\hat{\sigma}) = 0.0261$, respectively. By inverting $I_{\tilde{w}}(\hat{\sigma})$, we have $\mathrm{Var}(\hat{\sigma}) = 38.1882$.
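The missing information principle can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: it takes the complete information to be $4n/\sigma^2$ (the Rayleigh value, evaluated at the MLE) and computes the missing information as $\sum_i \mathrm{Var}_\sigma(W_i^2 \mid \tilde{w}_i)/\sigma^6$, which is the variance of the conditional score $(w^2 - E_\sigma(W^2 \mid \tilde{w}))/\sigma^3$. The triangular data and all function names are hypothetical.

```python
import math

def rayleigh_pdf(w, sigma):
    return (w / sigma**2) * math.exp(-w * w / (2.0 * sigma**2))

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def triangular(c, a, b):
    """Triangular fuzzy number with mean c and spreads a (left), b (right)."""
    def mu(w):
        if c - a < w <= c:
            return (w - (c - a)) / a
        if c < w < c + b:
            return ((c + b) - w) / b
        return 0.0
    return mu, c - a, c + b

def information(data, sigma):
    """Return (complete, missing, observed) information at sigma."""
    complete = 4.0 * len(data) / sigma**2
    missing = 0.0
    for mu, lo, hi in data:
        p = simpson(lambda w: rayleigh_pdf(w, sigma) * mu(w), lo, hi)
        m2 = simpson(lambda w: w**2 * rayleigh_pdf(w, sigma) * mu(w), lo, hi) / p
        m4 = simpson(lambda w: w**4 * rayleigh_pdf(w, sigma) * mu(w), lo, hi) / p
        missing += (m4 - m2 * m2) / sigma**6   # variance of the conditional score
    return complete, missing, complete - missing

# Hypothetical fuzzy sample and a plug-in value of sigma.
sample = [triangular(c, 0.5, 0.5) for c in (2.0, 3.5, 5.0, 1.2, 4.1)]
complete, missing, observed = information(sample, 2.5)
variance = 1.0 / observed   # asymptotic variance of the estimate
```

As a plausibility check on the reported figures, taking n = 25 (the number of ball bearings mentioned in the abstract, an assumption about the sample size used here) gives $4n/\hat{\sigma}^2 = 100/55.1901^2 \approx 0.0328$, which agrees with the reported complete information.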

Conclusion
Although the maximum likelihood estimation of the parameter of the Rayleigh distribution based on censored data has been studied extensively, traditionally it is assumed that the available data are recorded as exact numbers. In real

Figure 1: Membership functions of the fuzzy observations $\tilde{x}_1$, $\tilde{x}_2$, $\tilde{x}_3$, $\tilde{x}_4$, $\tilde{x}_5$ and $\tilde{x}_6$.
In an LR-type fuzzy number, $c$ is called the mean value of $\tilde{x}$, and $a$ and $b$ are called the left and right spreads, respectively. Symbolically, the LR-type fuzzy number is denoted by $\tilde{x} = (a, c, b)$.

Table 1: Averages and variances of the MLE, variances from averaged observed Fisher information, and average number of iterations (AI) for different sample sizes