Tuesday, 17 August 2010

A novel semiparametric regression method for interval-censored data




Seungbong Han, Adin-Cristian Andrei and Kam-Wah Tsui have a paper currently available as a University of Wisconsin Biostatistics and Medical Informatics Department working paper. This essentially extends the concept of pseudo-observations, which up till now have only been applied to right-censored data, to interval-censored survival data. The idea remains the same except that S(t) is estimated using the NPMLE of the survival function (e.g. via the Turnbull self-consistent estimator or the iterative convex minorant algorithm).
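As a reminder of how pseudo-observations are built, here is a minimal Python sketch (assuming numpy) of the usual leave-one-out construction, n*S(t) - (n-1)*S_{-i}(t). It uses the Kaplan-Meier estimator for right-censored data as a stand-in; in the interval-censored setting the estimator argument would be replaced by an NPMLE routine. The function names are illustrative only and are not taken from the paper.

```python
import numpy as np

def kaplan_meier(time, event, t):
    """Kaplan-Meier estimate of S(t) for right-censored data."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival probabilities over event times <= t.
    for u in np.unique(time[event]):
        if u > t:
            break
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_observations(time, event, t, estimator=kaplan_meier):
    """Leave-one-out pseudo-observations for S(t).

    `estimator` is any function (time, event, t) -> estimate of S(t);
    swapping in an NPMLE would give the interval-censored analogue.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = estimator(time, event, t)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # drop subject i
        pseudo[i] = n * full - (n - 1) * estimator(time[keep], event[keep], t)
        keep[i] = True   # restore subject i
    return pseudo
```

With no censoring the Kaplan-Meier estimate is just the empirical survivor function, and the pseudo-observation for subject i reduces to the indicator that subject i survives beyond t, which is a handy sanity check.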

The paper is a little disappointing in only providing a very brief heuristic justification for using pseudo-observations in the interval-censoring case. For right-censored data, an estimate of the baseline survival function can be obtained as well as the regression parameters. No discussion of whether this is possible for the interval-censored case is given. However, the baseline estimates are likely to be highly unreliable (e.g. non-monotonic), since particular subjects may have extreme influence through their effect on where the mass points of the NPMLE occur. For example, the plot above is based on 1000 subjects with survival times generated from an exponential distribution with rate 0.25, subject to independent current status observation (at times uniform on (0,10)). Estimating the baseline survival from the pseudo-observations (calculated at times 1,2,3,...,10) leads to a survivor function which increases at time 5. It seems necessary that there should be more consideration of this issue, as well as of the choice of how many time points to evaluate the pseudo-observations at.
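The data-generating setup in that example can be sketched in Python (assuming numpy). For current status data the NPMLE of the distribution function is the isotonic regression of the status indicators ordered by observation time, so a simple pool-adjacent-violators routine suffices here. The seed, the function names and the right-continuous step convention between observation times are arbitrary choices for illustration; this is not the code behind the plot, and it stops at the NPMLE rather than the pseudo-observation regression.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: non-decreasing least-squares fit (equal weights)."""
    means, sizes = [], []
    for v in y:
        means.append(float(v))
        sizes.append(1)
        # Merge neighbouring blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            m2, s2 = means.pop(), sizes.pop()
            m1, s1 = means.pop(), sizes.pop()
            means.append((m1 * s1 + m2 * s2) / (s1 + s2))
            sizes.append(s1 + s2)
    return np.repeat(means, sizes)

def npmle_survival(c_obs, delta, t):
    """NPMLE of S(t) from current status data (c_obs, delta = 1{T <= c_obs})."""
    order = np.argsort(c_obs)
    c_sorted = c_obs[order]
    F = pava(delta[order])
    idx = np.searchsorted(c_sorted, t, side="right") - 1
    return 1.0 if idx < 0 else 1.0 - F[idx]

rng = np.random.default_rng(1)                      # arbitrary seed
n = 1000
t_event = rng.exponential(scale=1 / 0.25, size=n)   # exponential, rate 0.25
c_obs = rng.uniform(0, 10, size=n)                  # single inspection time
delta = (t_event <= c_obs).astype(float)            # current status indicator

F_hat = pava(delta[np.argsort(c_obs)])
for t in range(1, 11):
    print(t, round(npmle_survival(c_obs, delta, t), 3), round(np.exp(-0.25 * t), 3))
```

The printed columns are the time point, the NPMLE of S(t) and the true value exp(-0.25 t). The NPMLE itself is monotone by construction; the non-monotonicity discussed above arises only once pseudo-observations are formed from leave-one-out fits.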

The authors choose to transform the pseudo-observations before regressing on the covariates, rather than using a link function in a GLM. One problem with this approach is presumably that if the estimate of S(t) is 0 or 1, g(S(t))=-Inf or Inf.
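A tiny numerical illustration of that problem, in Python with numpy. The logit and complementary log-log functions below are generic examples of a transformation g and are not necessarily the ones used in the paper.

```python
import numpy as np

def logit(s):
    """log(s / (1 - s)), written to avoid an explicit division by zero."""
    s = np.asarray(s, dtype=float)
    with np.errstate(divide="ignore"):
        return np.log(s) - np.log1p(-s)

def cloglog(s):
    """Complementary log-log transform, log(-log(s))."""
    s = np.asarray(s, dtype=float)
    with np.errstate(divide="ignore"):
        return np.log(-np.log(s))

s = np.array([0.0, 0.5, 1.0])
print(logit(s))    # infinite at both ends of the unit interval
print(cloglog(s))  # likewise infinite at 0 and 1
```

Any response transformed this way cannot be used in an ordinary regression, whereas a GLM-style link is applied to the mean rather than to the individual responses.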

On a practical level, the authors use Icens to calculate the NPMLE. As noted previously, the MLEcens package seems to perform considerably better than Icens and would presumably speed up computation of the pseudo-observations.

A natural next step would be to consider pseudo-observations for interval censored multi-state data. The lack of non-parametric methods except in a few simple cases is an obvious bar to development in this direction.

Update: The paper has now been published in Communications in Statistics - Simulation and Computation.

Thursday, 12 August 2010

Two Pitfalls in Survival Analyses of Time-Dependent Exposure: A Case Study in a Cohort of Oscar Nominees

Wolkewitz, Allignol, Schumacher and Beyersmann have a new paper in The American Statistician. This uses data on the survival outcomes of Oscar nominees as a nice illustrative example of the possibility of length bias and time-dependent bias in survival analysis problems. The data were originally considered by Redelmeier and Singh in an Annals of Internal Medicine paper, where it was claimed that winning an Oscar significantly increased survival prospects. Possible pitfalls of such an analysis include assuming everyone is at risk of death from birth rather than from when they enter the study (e.g. at nomination), and assuming people who win an Oscar have the hazard relating to the win from the time of first nomination rather than from the time they actually won. The authors show the correct model, as well as a series of possible incorrect models, in terms of multi-state models.
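One way to picture how both pitfalls are avoided is the usual (start, stop] data layout, sketched here in Python. Each person enters the risk set at the age of first nomination (delayed entry) and only counts as a winner from the age of the win onwards. The field and function names are hypothetical and are not taken from the paper or its data.

```python
from typing import List, NamedTuple, Optional

class Interval(NamedTuple):
    start: float   # age at which this stretch of follow-up begins
    stop: float    # age at which it ends
    winner: int    # exposure status during the interval (0 or 1)
    died: int      # 1 if the interval ends in death, else 0

def split_follow_up(nominated: float, won: Optional[float],
                    exit_age: float, died: int) -> List[Interval]:
    """Split one person's follow-up at the time of winning.

    Follow-up starts at first nomination, not at birth, and time before
    the win is counted as unexposed.
    """
    if won is None or won >= exit_age:
        # Never won during follow-up: one unexposed interval.
        return [Interval(nominated, exit_age, 0, died)]
    if won <= nominated:
        # Won at first nomination: exposed throughout.
        return [Interval(nominated, exit_age, 1, died)]
    return [
        Interval(nominated, won, 0, 0),     # nominee, not yet a winner
        Interval(won, exit_age, 1, died),   # winner until death/censoring
    ]

# Hypothetical person: nominated at 30, won at 40, died at 75.
print(split_follow_up(30, 40, 75, 1))
```

Treating this person as a winner over the whole of (30, 75], or as at risk from age 0, would correspond to the two incorrect analyses described above.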