A unit is said to be randomly censored when the information on the time of occurrence of an event is not available due to loss to follow-up, withdrawal, or nonoccurrence of the outcome event before the end of the study. Under independent random (noninformative) censoring, each individual has his or her own failure time T and censoring time C; however, one can only observe the random vector (X, δ), where X = min(T, C) and δ indicates whether the observation is a failure or a censoring. The classical maximum likelihood approach is considered for analysing the generalised exponential distribution with randomly (noninformatively) censored samples, which occur most often in biological or medical studies. Bayes methods are also considered via the numerical approximation suggested by Lindley in 1980 and the Laplace approximation procedure developed by Tierney and Kadane in 1986, with assumed informative priors, under the linear exponential (LINEX) loss function and the squared error loss function. A simulation study is carried out to compare the proposed estimators, and two datasets are analysed for illustration.
A new distribution for analysing time-to-event data, known as the generalised exponential distribution, was introduced by . According to , the generalised exponential distribution can be used as an alternative to the well-known and widely used Weibull distribution in lifetime data analysis and reliability engineering.
The generalised exponential distribution has the distribution, density, and survival functions, respectively, as

F(t) = (1 − exp(−θt))^p,  t > 0,

f(t) = pθ exp(−θt)(1 − exp(−θt))^(p−1),

S(t) = 1 − (1 − exp(−θt))^p,
where p is the shape parameter and θ the scale parameter. Let the GE distribution with shape parameter p and scale parameter θ be denoted by GE(θ, p). According to , the two-parameter GE(θ, p) can be used quite effectively in analysing many lifetime datasets and can take the place of the two-parameter gamma and two-parameter Weibull distributions. The two-parameter GE(θ, p) can have an increasing or a decreasing failure rate depending on the shape parameter.
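These three functions can be sketched directly; the following minimal Python sketch (the function names are our own) implements them under the parameterisation above:

```python
import numpy as np

def ge_cdf(t, theta, p):
    """Distribution function F(t) = (1 - exp(-theta*t))^p of GE(theta, p)."""
    return (1.0 - np.exp(-theta * t)) ** p

def ge_pdf(t, theta, p):
    """Density f(t) = p*theta*exp(-theta*t)*(1 - exp(-theta*t))^(p - 1)."""
    return p * theta * np.exp(-theta * t) * (1.0 - np.exp(-theta * t)) ** (p - 1)

def ge_survival(t, theta, p):
    """Survival function S(t) = 1 - F(t)."""
    return 1.0 - ge_cdf(t, theta, p)
```

A quick sanity check is that p = 1 recovers the ordinary exponential distribution with rate θ.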
Studies that involve time-to-event or survival data analysis are focussed on measuring the time to the occurrence of an outcome. The event may be death, the occurrence of a clinical endpoint such as disease, or the attainment of a biochemical marker . A particular source of difficulty in the analysis of time-to-event data is the possibility that some individuals or units may not be observed for the full time to failure. In some circumstances, individuals or units do not fail but are lost to follow-up during the observation period. Instead of knowing the failure time t, all we know about these individuals is that their time to failure exceeds some value, say x, where x is the follow-up time of these individuals in the study; this is referred to as censoring.
Under random or noninformative censoring, a sample of, say, n elements is followed for some time, say T. An instance of this type of censoring occurs when the termination date for a medical trial is not fixed before the study starts but is chosen later, where the choice is influenced by the results of the study up to that time. In a straightforward generalisation of this scheme, which can be considered as time censoring, each element has a maximum inspection time, say T_i, for i = 1,…, n, which may vary from one element to another. Consider an experiment where we start by observing 50 cancer patients and terminate the experiment after a certain amount of time, irrespective of the number of patients that have died or survived by the specified time. Censoring of the surviving patients may be due to withdrawal, an inadequate monitoring mechanism, or death from causes unrelated to the purpose of the study.
The maximum likelihood estimator (MLE) is very popular both in the literature and in practice. Some research has been done to compare the MLE and the Bayesian approach for estimating the two parameters of the generalised exponential distribution using hybrid and complete failure time data. Amongst them are , who studied Bayesian estimation for the generalized exponential distribution. Reference  considered the generalized exponential distribution under different methods of estimation. Other related estimation procedures were considered by . Reference  determined the Bayes estimates of the reliability function and the hazard rate of the Weibull failure time distribution by employing the squared error loss function. Reference  studied Bayesian parameter and reliability estimates of the Weibull failure time distribution; reference  studied approximate Bayesian estimates for the Weibull reliability function and hazard rate from censored data by employing a new method that has the potential of reducing the number of terms in the Lindley procedure. See also [8, 9]. Reference  studied Bayes estimators of the modified Weibull distribution parameters using Lindley's approximation.
The main aim of this paper is to compare the classical maximum likelihood estimator to the proposed Bayesian estimators with two loss functions for the unknown parameters of the generalised exponential distribution for different sample sizes and parameter values.
Let (t_1,…, t_n) be a set of n random lifetimes from the generalised exponential distribution with parameters p and θ, where θ is the scale parameter and p the shape parameter.
In random censoring, as stated by , we assume t_i = min(T_i, C_i), with δ_i = 1 if T_i ≤ C_i and δ_i = 0 if T_i > C_i. The observed data from n individuals are assumed to consist of the pairs (t_i, δ_i), i = 1,…, n, so that the final result obtained will be the same provided C_i is available for all i.
It is therefore assumed that T ⟂ C; that is, T and C are independent of each other, which implies that the censoring time C is noninformative about the failure time T. For this assumption to be valid, one has to ensure that the loss to follow-up of individuals is not a result of the failure process under study. The likelihood function with respect to randomly censored data is

L(θ, p) = ∏_{i=1}^{n} f(t_i)^{δ_i} S(t_i)^{1−δ_i},

where S(·) is the survival function. Calculation of the maximum likelihood estimates often requires that some iterative procedure (e.g., Newton-Raphson) be implemented to obtain the parameter estimates. This can easily be done in any standard statistical software.
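As a sketch of how this maximisation can be carried out in practice, the randomly censored log-likelihood can be handed to a general-purpose optimiser; the simulated data, seed, starting values, and names below are our own assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, t, delta):
    """Negative log-likelihood for randomly censored GE(theta, p) data:
    failures contribute log f(t_i), censored points contribute log S(t_i)."""
    theta, p = params
    if theta <= 0.0 or p <= 0.0:
        return np.inf
    u = -np.expm1(-theta * t)  # 1 - exp(-theta*t), in (0, 1)
    log_f = np.log(p * theta) - theta * t + (p - 1.0) * np.log(u)
    log_S = np.log1p(-u ** p)  # log(1 - (1 - exp(-theta*t))^p)
    return -(log_f[delta].sum() + log_S[~delta].sum())

rng = np.random.default_rng(42)
n, theta0, p0 = 2000, 1.0, 2.0
T = -np.log(1.0 - rng.uniform(size=n) ** (1.0 / p0)) / theta0  # GE lifetimes
C = rng.uniform(0.0, 6.0, size=n)                              # censoring times
t, delta = np.minimum(T, C), T <= C                            # observed pairs

res = minimize(neg_log_lik, x0=[0.5, 1.0], args=(t, delta), method="Nelder-Mead")
theta_hat, p_hat = res.x
```

With a large sample and moderate censoring, the recovered estimates should sit close to the true (θ, p) used to simulate.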
In this section we consider the Bayes estimation of the two unknown parameters. Since both parameters are assumed to be greater than zero, we let them take the following gamma prior distributions:

π(θ) ∝ θ^(a−1) exp(−bθ),  θ > 0,

π(p) ∝ p^(c−1) exp(−dp),  p > 0.
Assume that the hyperparameters a, b, c, and d are known and positive. The joint density function of the data, θ, and p can then be obtained as the product L(θ, p)π(θ)π(p).
Bayesian inference is based on the posterior distribution, which is given as

π(θ, p ∣ t) = L(θ, p)π(θ)π(p) / ∫∫ L(θ, p)π(θ)π(p) dθ dp.
The ratio of the two integrals given in (5) cannot be obtained in closed form. We could apply a numerical integration technique, but this may be computationally intensive, especially in a high-dimensional parameter space. It is also possible to make use of numerical approximation methods such as  and/or . In this paper we consider both methods for this censoring scheme and this distribution, since we are unaware of any study employing both methods for this distribution with this type of censored data, apart from  with uncensored data. The Bayesian approach is considered under two loss functions, namely, the LINEX loss and the squared error loss.
Reference  proposed a procedure to approximate a ratio of integrals. This approach has been used by several authors, for example [3, 6], to obtain approximate Bayes estimators. Reference  considers ratios of integrals of the form

E[u(α) ∣ t] = ∫ ω(α) exp(ℓ(α)) dα / ∫ v(α) exp(ℓ(α)) dα,

where α = (α_1, α_2,…, α_m), ℓ(α) is the logarithm of the likelihood function, and ω(α) and v(α) are arbitrary functions of α. Assume that v(α) is the prior distribution for α and ω(α) = u(α) · v(α), with u(α) being some function of interest. The posterior expectation of u(α) is then approximated, with all quantities evaluated at the MLE α̂, by

E[u(α) ∣ t] ≈ u(α̂) + (1/2) Σ_{i,j} (u_{ij} + 2u_i ρ_j) σ_{ij} + (1/2) Σ_{i,j,k,l} ℓ_{ijk} σ_{ij} σ_{kl} u_l,

where ρ = log v(α), subscripts on u, ρ, and ℓ denote partial derivatives, and σ_{ij} is the (i, j)th element of the inverse of the matrix [−ℓ_{ij}].
Considering the Bayesian estimator under the squared error loss function, which is the posterior mean, the following can be obtained, where u_1 and u_11 are the first and second derivatives of u with respect to the scale parameter θ, while u_2 and u_22 are the first and second derivatives with respect to the shape parameter p:
Refer to appendix section for derivatives with respect to the shape and scale parameters.
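The two-parameter derivatives are deferred to the appendix; to show the mechanics of Lindley's expansion on a case where the exact answer is known, here is a hedged one-parameter sketch for an exponential likelihood with a conjugate Gamma(a, b) prior (all names and numbers are our own, not from the paper):

```python
import numpy as np

def lindley_posterior_mean(n, S, a, b):
    """One-parameter Lindley approximation to E[theta | data] for an
    exponential likelihood (n failures, total time S) with a Gamma(a, b) prior:
    E[u] ~ u + (u''/2 + u'*rho') * sigma2 + 0.5 * l''' * u' * sigma2^2,
    with u(theta) = theta, all terms evaluated at the MLE theta_hat = n / S."""
    theta_hat = n / S
    sigma2 = theta_hat ** 2 / n          # -1 / l''(theta_hat), l'' = -n/theta^2
    rho_prime = (a - 1.0) / theta_hat - b  # derivative of the log prior
    l3 = 2.0 * n / theta_hat ** 3          # third derivative of log-likelihood
    return theta_hat + rho_prime * sigma2 + 0.5 * l3 * sigma2 ** 2

n, S, a, b = 50, 100.0, 2.0, 3.0
approx = lindley_posterior_mean(n, S, a, b)
exact = (n + a) / (S + b)  # posterior is Gamma(n + a, S + b), mean known exactly
```

For this conjugate case the approximation lands within a fraction of a percent of the exact posterior mean, which illustrates why the expansion is attractive when no closed form exists.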
Unlike the symmetric squared error loss function, this loss function weighs underestimation and overestimation of the estimated parameter asymmetrically.
The Bayes estimator of, say, α, denoted by α̂_BL, under the LINEX loss function is

α̂_BL = −(1/k) log(E_α[exp(−kα)]),

provided that E_α[exp(−kα)] exists and is finite.
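As an illustration (ours, not from the paper), when the posterior of a parameter is Gamma(α₀, β₀) the expectation E[exp(−kθ)] has a closed form, so the LINEX estimate can be computed analytically and checked by Monte Carlo:

```python
import numpy as np

def linex_estimate_gamma(alpha0, beta0, k):
    """LINEX Bayes estimate -(1/k) * ln E[exp(-k*theta)] when the posterior is
    Gamma(alpha0, beta0): E[exp(-k*theta)] = (beta0 / (beta0 + k))**alpha0."""
    return (alpha0 / k) * np.log1p(k / beta0)

def linex_estimate_mc(samples, k):
    """Monte Carlo version of the same estimate from posterior draws."""
    return -np.log(np.mean(np.exp(-k * samples))) / k

alpha0, beta0, k = 52.0, 103.0, 0.7
rng = np.random.default_rng(0)
draws = rng.gamma(shape=alpha0, scale=1.0 / beta0, size=200_000)

analytic = linex_estimate_gamma(alpha0, beta0, k)
mc = linex_estimate_mc(draws, k)
posterior_mean = alpha0 / beta0
# With k > 0 the LINEX estimate sits below the posterior mean,
# reflecting the heavier penalty on overestimation.
```

The sign of k controls the asymmetry: k > 0 pulls the estimate below the posterior mean, k < 0 pushes it above.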
The Bayes estimator of a function u_BL = u[exp(−kθ), exp(−kp)] under the LINEX loss is given as
As seen in Section 3.1, the Lindley approach requires evaluating the third derivatives of the likelihood function. Depending on the distribution and the number of parameters involved, this can be very difficult. Tierney and Kadane, through the Laplace approximation procedure, gave an alternative to the Lindley approach which requires only the first and second derivatives of the likelihood function. Let L(α; t) be the likelihood function of α based on n observations, let π(α) be the prior distribution defined over the parameter space, and let q(α ∣ t) be the posterior distribution of α. The Bayes estimate of a function u(α) under the squared error loss function is the posterior mean and is given as

E[u(α) ∣ t] = ∫ u(α) π(α) L(α; t) dα / ∫ π(α) L(α; t) dα.
Equation (13) can be approximated via Laplace's method in the form

E[u(α) ∣ t] ≈ (det Σ* / det Σ)^(1/2) exp{n[ℓ*(α̂*) − ℓ(α̂)]},

where nℓ(α) = log π(α) + log L(α; t) and nℓ*(α) = nℓ(α) + log u(α), α̂* and α̂ maximise ℓ*(α) and ℓ(α), respectively, and Σ* and Σ are the negatives of the inverse Hessians of ℓ* and ℓ at α̂* and α̂, respectively.
The matrix Σ takes the form of the inverse of the negative 2 × 2 Hessian of ℓ, with entries −∂²ℓ/∂θ², −∂²ℓ/∂θ∂p, and −∂²ℓ/∂p², evaluated at (θ̂, p̂). We can similarly obtain the expression for the matrix Σ*, which involves the partial derivatives of ℓ*. In applying the method, the following need to be maximised:

nℓ(θ, p) = log π(θ) + log π(p) + Log L(θ, p; t, δ),  nℓ*(θ, p) = nℓ(θ, p) + log u(θ, p).
Setting ∂ℓ/∂θ and ∂ℓ/∂p to zero produces a system of two equations, where ∂ Log L(θ, p; t_i, δ_i)/∂θ and ∂ Log L(θ, p; t_i, δ_i)/∂p are easy to obtain. Refer to the Appendix for ∂²ℓ/∂θ², ∂²ℓ/∂p², and ∂²ℓ/∂θ∂p.
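To make the Tierney and Kadane recipe concrete on a case with a known answer, here is a hedged one-parameter sketch for an exponential likelihood with a Gamma(a, b) prior, where the exact posterior mean is (n + a)/(S + b) (all names and numbers are our own):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Exponential likelihood: n failures with total observed time S; Gamma(a, b) prior.
n, S, a, b = 50, 100.0, 2.0, 3.0

def h(theta):
    """Log posterior kernel: log prior + log likelihood."""
    return (a - 1.0) * np.log(theta) - b * theta + n * np.log(theta) - theta * S

def h_star(theta):
    """h plus the log of the function of interest u(theta) = theta."""
    return h(theta) + np.log(theta)

def laplace_peak(f):
    """Maximiser, maximum value, and curvature-based variance of f."""
    res = minimize_scalar(lambda x: -f(x), bounds=(1e-6, 10.0), method="bounded")
    m = res.x
    eps = 1e-4
    second = (f(m + eps) - 2.0 * f(m) + f(m - eps)) / eps ** 2  # f''(m) < 0
    return m, f(m), -1.0 / second

m0, f0, v0 = laplace_peak(h)        # peak of the denominator integrand
m1, f1, v1 = laplace_peak(h_star)   # peak of the numerator integrand
tk_mean = np.sqrt(v1 / v0) * np.exp(f1 - f0)  # Tierney-Kadane estimate
exact = (n + a) / (S + b)                     # exact posterior mean
```

Both maximisations involve only first- and second-order information, which is the practical advantage over the Lindley expansion with its third derivatives.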
The Bayesian estimator of a function u_BL = u[exp(−kθ), exp(−kp)] under the LINEX loss with respect to the Tierney and Kadane procedure is given as
The same approach is also adopted with the squared error loss function to obtain the Bayes estimates of the unknown parameters.
The data for this example are on survival of patients with cervical cancer, recruited to a randomised trial aimed at analysing the effect of adding a radiosensitiser to radiotherapy (new therapy, "treatment B") compared to using radiotherapy alone (control, "treatment A"). Treatment A was given to 16 patients and treatment B to 14. The data are in days since the start of the study; the event of interest is death caused by this cancer. Our main interest is in the patients under treatment A, a fairly small sample with which to illustrate the proposed methods. The data are obtained from . Starred observations are censored: 90, 890*, 142, 1037, 150, 1090*, 269, 1113*, 291, 1153, 468*, 1297, 680, 1429, 837, 1577*. The results are depicted in Table 3.
The following data which are considered large are obtained from . The data represent survival times for 121 breast cancer patients who were treated over the period 1929–1938. Times are in months and asterisks denote censoring times: 0.3, 0.3*, 4.0*, 5.0, 5.6, 6.2, 6.3, 6.6, 6.8, 7.4*, 7.5, 8.4, 8.4, 10.3, 11.0, 11.8, 12.2, 12.3, 13.5, 14.4, 14.4, 14.8, 15.5*, 15.7, 16.2, 16.3, 16.5, 16.8, 17.2, 17.3, 17.5, 17.9, 19.8, 20.4, 20.9, 21.0, 21.0, 21.1, 23.0, 23.4*, 23.6, 24.0, 24.0, 27.9, 28.2, 29.1, 30, 31, 31, 32, 35, 35, 37*, 37*, 37*, 38, 38*, 38*, 39*, 39*, 40, 40*, 40*, 41, 41, 41*, 42, 43*, 43*, 43*, 44, 45*, 45*, 46*, 46*, 47*, 48, 49*, 51, 51, 51*, 52, 54, 55*, 56, 57*, 58*, 59*, 60, 60*, 60*, 61*, 62*, 65*, 65*, 67*, 67*, 68*, 69*, 78, 80, 83*, 88*, 89, 90, 93*, 96*, 103*, 105*, 109*, 109*, 111*, 115*, 117*, 125*, 126, 127*, 129*, 129*, 139*, 154*.
Since it is difficult to compare the performance of the proposed methods theoretically, we have performed an extensive simulation to compare the estimators through mean squared errors and absolute biases, employing different sample sizes and parameter values. We considered sample sizes of n = 25, 50, and 100. The following steps were employed to generate the data. The generation of GE(θ, p) variates is simple, as stated in . If U follows a uniform distribution on the interval [0, 1], then Y = −ln(1 − U^(1/p))/θ follows GE(θ, p). Consequently, with a good uniform random number generator, the generation of GE(θ, p) random deviates is immediate.
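The inversion step just described can be sketched as follows (seed and sample size are our own choices); as a check, the GE(θ, p) mean is (ψ(p + 1) − ψ(1))/θ, which equals 1.5 for θ = 1, p = 2:

```python
import numpy as np
from scipy.special import digamma

def rand_ge(theta, p, size, rng):
    """Draw GE(theta, p) deviates by inverting F(y) = (1 - exp(-theta*y))^p:
    Y = -ln(1 - U**(1/p)) / theta with U ~ Uniform(0, 1)."""
    u = rng.uniform(size=size)
    return -np.log(1.0 - u ** (1.0 / p)) / theta

rng = np.random.default_rng(123)
y = rand_ge(theta=1.0, p=2.0, size=200_000, rng=rng)

# GE(theta, p) mean: (digamma(p + 1) - digamma(1)) / theta = 1.5 here.
theoretical_mean = (digamma(3.0) - digamma(1.0)) / 1.0
```

Comparing the sample mean of the draws against the theoretical mean is a cheap validation of the generator.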
A lifetime T representing failure of the product is generated, for each of the sample sizes indicated above, from the GE(θ, p) distribution. The values of the assumed actual shape parameter p of the GE(θ, p) distribution were taken to be 0.8, 1.2, and 2.0. The scale parameter θ was set to 1 throughout without loss of generality. A sample of the same size is generated from the uniform distribution (0, b) for the censoring times C, where the value of b depends solely on the proportion of observations that are censored; in our study we set the percentage of censoring to 25. The observed time is t_i = min(T_i, C_i), the minimum of the failure time and the censoring time. To compute the Bayes estimates, θ and p are assumed to take Gamma(a, b) and Gamma(c, d) priors, respectively. We set the hyperparameters to 0, that is, a = b = c = d = 0, which makes the priors noninformative. The values of the loss parameter for the LINEX loss function are k = ±0.7. The simulation was repeated 1000 times, and the mean squared errors and absolute biases were computed and presented for the purpose of comparison.
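The data-generation steps above can be sketched as follows; the uniform upper bound b = 6 is our own choice, tuned so that roughly 25% of GE(1, 2) lifetimes are censored:

```python
import numpy as np

def simulate_censored_sample(n, theta, p, b, rng):
    """Generate (t_i, delta_i) pairs: lifetimes T ~ GE(theta, p) via inversion,
    censoring times C ~ Uniform(0, b), observed t = min(T, C), and
    delta = 1 for a failure, 0 for a censored observation."""
    T = -np.log(1.0 - rng.uniform(size=n) ** (1.0 / p)) / theta
    C = rng.uniform(0.0, b, size=n)
    t = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return t, delta

rng = np.random.default_rng(2024)
t, delta = simulate_censored_sample(n=10_000, theta=1.0, p=2.0, b=6.0, rng=rng)
censoring_fraction = 1.0 - delta.mean()
```

In practice b would be re-tuned for each (θ, p) pair to hold the censoring percentage at the target level.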
The main objective of this study is to obtain the estimates of the generalised exponential distribution parameters and compare the proposed methods applied in this paper. In order to examine the estimates of the parameters which cannot be obtained analytically, we made use of different numerical approximation procedures and have obtained absolute biases and mean squared errors of the estimated parameters.
Observing from Table 1 and Figures 1 and 2, it is evident that the smallest mean squared errors as well as the smallest absolute biases for the estimated scale parameter occurred under the Bayesian estimator with the linear exponential loss function. The loss parameter giving the smallest mean squared errors is k = 0.7, which is above zero, implying this approach is preferred when overestimation is more serious than underestimation. This occurred largely with the Lindley numerical approximation procedure, followed by Tierney and Kadane. As the sample size increases, the mean squared errors of all the estimators correspondingly decrease. It is also worth noting that the Lindley approximation method under the squared error loss function performed better than the Tierney and Kadane method with respect to the generalised exponential scale parameter. As illustrated in Figure 2, the Tierney and Kadane estimators had equal minimum absolute biases.
Considering Table 2 alongside Figures 3 and 4, which contain the mean squared errors and the absolute biases of the estimated shape parameter, we noticed that the Bayesian estimator under the Tierney and Kadane method performed better than the Lindley approach, but the maximum likelihood estimator overall had the smallest mean squared error, followed by Tierney and Kadane. The minimum absolute bias occurred predominantly with the Tierney and Kadane approach. The bold numbers indicate the smallest mean squared errors and minimum absolute biases of the estimated parameters with their corresponding estimators. The Bayes estimator with the Lindley numerical approximation procedure performed better under the squared error loss function for the shape parameter than the Tierney and Kadane numerical method to a very large extent. The mean squared errors of all the estimators got closer as the sample size increased.
The Bayesian estimator under the linear exponential loss function with the positive loss parameter has the smallest standard error, as illustrated in Table 3. This happened with the approximation procedure suggested by Tierney and Kadane; the positive loss parameter implies that overestimation of the scale and shape parameters of the generalised exponential distribution is treated as the more serious error. In this example, where the sample size is fairly small, we noticed that the Tierney and Kadane approach via Bayes under squared error loss performs somewhat better than the Lindley method as well as the maximum likelihood estimator.
Using the iterative procedure suggested in this paper for both the MLE and the Bayes estimators on data 2, the MLEs of the two parameters are 0.765027 and 6.277847, with corresponding standard errors 0.006323 and 0.010377. Since we do not have any prior information on the hyperparameters, we assume a = b = c = d = 0, which makes the priors noninformative. For computing the Bayes estimators, we considered the squared error and linear exponential loss functions with gamma priors on both parameters, in the same way as in the simulation section. Computing the Bayes estimators via the Lindley approximation procedure under squared error loss gives parameter estimates and standard errors of 0.765027, 6.277847 and 0.006325, 0.010376, respectively.
Computing the Bayes estimates and their corresponding standard errors under the linear exponential loss function with a loss parameter of 0.7, we have 0.765029, 6.277860 and 0.006323, 0.010377; with the loss parameter −0.7, we have 0.765025, 6.277840 and 0.006323, 0.010376, respectively. The 95% confidence intervals of the MLEs are (0.752634, 0.777419) and (6.257508, 6.298185). The Bayes credible intervals under the squared error loss function are (0.752634, 0.777419) and (6.257508, 6.298185), respectively. The Bayes credible intervals with respect to the LINEX loss function with loss parameter 0.7 are (0.752637, 0.777421) and (6.257521, 6.298198), and those for −0.7 are (0.752633, 0.777417) and (6.257501, 6.298178), respectively.
Computing the Bayes estimators using the Tierney and Kadane (T & K) approximation procedure under the squared error loss function, we have parameter estimates and standard errors of 0.764725, 6.275374 and 0.006320, 0.010373, respectively. Calculating the Bayes estimates via Tierney and Kadane under the linear exponential loss function with a loss parameter of +0.7, we have 0.765633, 6.282807 and 0.006328, 0.010385; with the loss parameter −0.7, we have 0.763671, 6.277809 and 0.006311, 0.010358, respectively. The Bayes credible intervals using Tierney and Kadane under the squared error loss function are (0.752338, 0.777113) and (6.255044, 6.295704). The Bayes credible intervals with respect to the LINEX loss function with loss parameter +0.7 are (0.753231, 0.778035) and (6.262453, 6.303161), and those for −0.7 are (0.751301, 0.776041) and (6.246441, 6.287045), respectively.
As shown above, the estimator with the smallest standard error is the Bayesian estimator under the linear exponential loss function for both the scale and shape parameters. This happened under the Tierney and Kadane numerical approximation procedure, followed by the Bayes estimator using the squared error loss function, again with the Tierney and Kadane method. We observed that the linear exponential loss function produced the narrowest credible intervals with the Tierney and Kadane approach, compared to the credible intervals of Bayes using Lindley and the confidence intervals obtained from the maximum likelihood estimator. This happened with a negative loss parameter, an indication of underestimation of the generalised exponential distribution parameters.
From the results and discussion above it is evident that the Bayesian estimator under the linear exponential loss function performed better than Bayes under the squared error loss function and the maximum likelihood estimator for estimating both the scale and shape parameters, in terms of both mean squared error and absolute bias. The Lindley method performed better than T & K for the scale parameter with regard to mean squared errors, while T & K performed better for the shape parameter with respect to both mean squared errors and absolute bias. Considering the standard errors obtained for the real data analysis, we can state that the T & K method outperformed the Lindley numerical approximation and the maximum likelihood estimator.
Let the following assumptions hold under the Lindley approach:
Note that ℓ_20 and ℓ_30 are the second and third derivatives of the log-likelihood function with respect to the scale parameter, while ℓ_02 and ℓ_03 are the corresponding derivatives with respect to the shape parameter.
The authors declare that there is no conflict of interests regarding the publication of this paper.