Tuesday, 22 September 2009

Estimating stroke-free and total life expectancy in the presence of non-ignorable missing values

Van den Hout and Matthews have a new paper in JRSS A. It deals with interval-censored data from a three-state disease model in which subjects may miss scheduled interviews, so that the disease status is not observed. A joint model for the disease state and an observation indicator is developed; it is a continuous-time generalisation of Cole et al (2005) that allows information from exact death times to be included. Conditional on the disease state and measured covariates, the observation indicator is governed by a logistic model.
In general the method seems promising for dealing with informative observation when the potential observation times are known.

Although the authors do not note it, the model can be expressed as a hidden Markov model. The authors state that the logistic model and the three-state Markov model are estimated separately. It is not clear how this is achieved, since the logistic model depends on the unobserved states of the Markov model. In some cases the missing state will in fact be known, for instance if the sequence is 1,-,1 or 2,-,2. However, for sequences like 1,-,2 or 1,-,3 it is not possible to establish the unobserved state.
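To make the point about unobserved states concrete, here is a minimal Python sketch. It is not the authors' code. It assumes a progressive model (state 1 = stroke-free, 2 = post-stroke, 3 = dead, with no recovery from 2 to 1), that a subject who misses an interview is alive at that time, and, as a simplification, equally spaced interviews with interval-censored transitions, ignoring exact death times. The names `candidate_states`, `missing_contribution`, `P` and `p_miss` are hypothetical.

```python
# States reachable from each state over an interval, assuming a progressive
# model: 1 = stroke-free, 2 = post-stroke, 3 = dead (absorbing), no recovery.
REACHABLE = {1: {1, 2, 3}, 2: {2, 3}, 3: {3}}


def candidate_states(prev_state, next_state, alive_states=(1, 2)):
    """States a subject could have occupied at a missed interview.

    Assumes the subject was alive at the missed interview, so state 3
    is never a candidate.
    """
    return [s for s in alive_states
            if s in REACHABLE[prev_state] and next_state in REACHABLE[s]]


def missing_contribution(P, p_miss, prev_state, next_state):
    """Hidden-Markov-style likelihood term for the pattern prev, -, next.

    P[r][s] is a transition probability over one interview interval and
    p_miss[s] is the probability of missing the interview given state s
    (the logistic model evaluated at that state). The unobserved state
    is summed out.
    """
    return sum(P[prev_state][s] * p_miss[s] * P[s][next_state]
               for s in candidate_states(prev_state, next_state))


# Illustrative numbers only, not estimates from the paper.
P = {1: {1: 0.8, 2: 0.1, 3: 0.1},
     2: {1: 0.0, 2: 0.7, 3: 0.3},
     3: {1: 0.0, 2: 0.0, 3: 1.0}}
p_miss = {1: 0.1, 2: 0.4}
```

For 1,-,1 and 2,-,2 the candidate list has a single element, so the observation model could be fitted directly for those records. For 1,-,2 and 1,-,3 the term mixes `p_miss[1]` and `p_miss[2]` with the transition probabilities, which is why separate estimation of the two models is not obviously possible.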

The main aim of the analysis is to obtain estimates of total life expectancy, disease-free life expectancy and post-disease life expectancy. These are complicated functions of the parameter vector, as they involve integrals of transition probabilities. In addition to the method of Aalen et al (1997), Van den Hout and Matthews propose using a Metropolis algorithm to obtain confidence intervals for the life expectancies. The resulting intervals have a Bayesian interpretation, being credible intervals under an improper uniform prior, and so will not be invariant to changes in parametrisation. In practice, the intervals may give good frequentist coverage, particularly for large samples. However, Van den Hout and Matthews seem to imply that the intervals have exact coverage (apart from Monte Carlo error from the Metropolis algorithm), which is a substantial misconception. Moreover, the paper makes no mention that the procedure is Bayesian.
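As an illustration of life expectancies as integrals of transition probabilities, the following Python sketch integrates P11(t) and P12(t) numerically for a subject starting in state 1. It is a simplified stand-in, not the model in the paper: it assumes a progressive three-state model with constant transition intensities `q12`, `q13` and `q23` (hypothetical names), no covariates or age dependence, and `q12 + q13` different from `q23`.

```python
import math


def transition_probs(t, q12, q13, q23):
    """P11(t) and P12(t) for a progressive model with constant intensities.

    Assumes q12 + q13 != q23, otherwise the expression for P12 is undefined.
    """
    lam = q12 + q13  # total intensity of leaving state 1
    p11 = math.exp(-lam * t)
    p12 = q12 / (lam - q23) * (math.exp(-q23 * t) - math.exp(-lam * t))
    return p11, p12


def life_expectancies(q12, q13, q23, horizon=200.0, step=0.01):
    """Trapezoidal integrals of P11 and P12 over [0, horizon].

    Returns (stroke-free, post-stroke, total) life expectancy from state 1.
    The horizon must be long enough for the survival probability to be
    negligible at its end.
    """
    n = int(round(horizon / step))
    e1 = e2 = 0.0
    for k in range(n + 1):
        w = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        p11, p12 = transition_probs(k * step, q12, q13, q23)
        e1 += w * p11 * step
        e2 += w * p12 * step
    return e1, e2, e1 + e2


# Illustrative intensities only, not estimates from the paper.
e1, e2, total = life_expectancies(q12=0.1, q13=0.05, q23=0.2)
```

In this constant-intensity case the integrals have closed forms, 1/(q12 + q13) and q12/((q12 + q13) q23), which the numerical values should match closely. With a sampler such as the Metropolis algorithm, each parameter draw would be passed through a function of this kind and an interval read off from the quantiles of the resulting life expectancies; that interval is then a posterior credible interval under whatever prior the sampler implies.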