Wednesday 7 December 2011

Intermittent observation of time-dependent explanatory variables: a multistate modelling approach

Brian Tom and Vern Farewell have a new paper in Statistics in Medicine. This considers the problem of estimating the effect of a time-dependent covariate on a multi-state process when both the disease process of interest and the time-dependent covariate are only intermittently observed. The most common existing approach to dealing with this problem is to assume that the time-dependent covariate is constant between observations, taking the last observed value. The authors instead jointly model the two processes as an expanded multi-state model: if the disease process has n states and the covariate process m states, the resulting process will have n × m states. An additional assumption, that movements in the covariate process are not directly affected by the state of the disease process, is also made.
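To make the expanded state space concrete, here is a minimal Python sketch of how a joint intensity matrix on the n × m states might be assembled. It is an illustration rather than the authors' implementation. The function name `joint_generator`, the state ordering, the numerical values, and the restriction that the disease and covariate never jump at the same instant are all assumptions made here. The one feature taken from the paper is that the covariate intensities do not depend on the disease state.

```python
def joint_generator(disease_q, covariate_g):
    """Build the intensity matrix of the expanded process (hypothetical helper).

    disease_q[z] is the n x n disease intensity matrix that applies while the
    covariate is in state z; covariate_g is the m x m covariate intensity
    matrix, shared by every disease state. Diagonal entries of the inputs are
    ignored and recomputed. The pair (disease i, covariate z) is stored at
    index i * m + z.
    """
    m = len(covariate_g)
    n = len(disease_q[0])
    size = n * m
    q = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for z in range(m):
            row = i * m + z
            # Disease moves i -> j while the covariate stays at z.
            for j in range(n):
                if j != i:
                    q[row][j * m + z] = disease_q[z][i][j]
            # Covariate moves z -> w while the disease stays at i.
            for w in range(m):
                if w != z:
                    q[row][i * m + w] = covariate_g[z][w]
            # Diagonal is minus the sum of the off-diagonal intensities.
            q[row][row] = -sum(q[row])
    return q


# Made-up example: two disease states (state 1 absorbing) and a binary covariate.
disease_q = [
    [[-0.1, 0.1], [0.0, 0.0]],  # disease intensities when covariate = 0
    [[-0.3, 0.3], [0.0, 0.0]],  # disease intensities when covariate = 1
]
covariate_g = [[-0.2, 0.2], [0.05, -0.05]]
Q = joint_generator(disease_q, covariate_g)
```

With n = 2 and m = 2 the matrix `Q` is 4 × 4, and the entries for simultaneous changes in both processes are zero by construction.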

A simulation study is performed which shows that the approach of assuming the time-dependent covariate is constant leads to biased estimates, particularly when the covariate process has a systematic trend (e.g. it is much more likely to decrease than increase in value).

The overall approach taken by the authors is to model their way out of difficulty. They assume that the disease process of interest and the covariate process are jointly time-homogeneous Markov, and the validity of the results will depend on these assumptions being correct. As noted by the authors, if the covariate can take more than a small number of values the approach becomes unattractive because of the large number of nuisance parameters required. A point not really emphasized, but related to the analogous approach taken by Cortese and Andersen for continuously observed competing risks data (bizarrely not referenced in this paper despite massive relevance!), is that having modelled the time-dependent covariate, the model can then be used to make overall predictions.

One could argue that the convention of carrying forward the covariate value observed in the previous period is a way of allowing a prediction to be made about the trajectory in the next period. A fairer comparison in some cases might therefore be to look at the bias in estimating the transition probabilities to a later time given the covariate value observed at an earlier time. While we would expect these estimates still to be biased, the amount of bias is likely to be less than that found by looking at the regression coefficients directly.
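Under the joint time-homogeneous Markov assumption, the prediction in question can be written down explicitly. The notation below is assumed for illustration and is not taken from the paper: X(t) is the disease state, Z(t) the covariate state with m possible values, and Q the intensity matrix of the joint process on pairs (i, z). For s < t,

```latex
\Pr\{X(t) = j \mid X(s) = i,\ Z(s) = z\}
  = \sum_{w=1}^{m} \left[ \exp\{(t - s)\,Q\} \right]_{(i,z),(j,w)}
```

The sum over w marginalises over where the covariate ends up at time t. Carrying the last observed value forward would instead amount to using the (i, j) entry of \exp\{(t - s) Q_z\}, where Q_z denotes the disease intensity matrix with the covariate held fixed at z; the comparison suggested above is between these two quantities.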

An open problem seems to be the development of methods that do not require strong assumptions, or else are robust to misspecification, to deal with intermittently observed time dependent covariates.
