## Friday, 7 January 2011

### Parametric Estimation of Cumulative Incidence Functions for Interval Censored Competing Risks Data

Peter Nyangweso, Michael Hudgens and Jason Fine have a new paper currently available as a University of North Carolina at Chapel Hill Department of Biostatistics Technical Report. It considers parametric modelling of cumulative incidence functions (CIFs) for interval-censored competing risks data. It is somewhat curious that, while there has been quite a lot of work on non-parametric modelling of interval-censored competing risks, little seems to have been done on the (presumably easier) problem of parametric modelling (although lots of work has been done for more general multi-state models).

Nyangweso and colleagues consider a parameterisation based on the cumulative incidence functions. This makes sense since the interval-censored likelihood can be directly written in terms of the CIFs. They consider a Gompertz model for each CIF taking the form:

$F_{k}(t;\Theta_k) = 1 - \exp[\beta_k \{1-\exp(\alpha_k t)\}/\alpha_k]$

Note that when $\alpha_k < 0$ the CIF plateaus at $1 - \exp(\beta_k/\alpha_k)$, so the model allows for a proportion $\exp(\beta_k/\alpha_k)$ who will never experience event k. An obvious complication is that a valid set of CIFs needs to have the property that $\sum_{k}^{n_k} F_k(t;\Theta_k) \leq 1$. It is noted that in practice the unconstrained MLE will be such that the sum of the CIFs is less than 1 at the greatest time at which someone was right censored, as otherwise the likelihood involves taking the log of a non-positive number. There is also some discussion of constrained optimization, where $\lim_{t \rightarrow \infty }\sum_{k}^{n_k} F_k(t;\Theta_k) = 1$, so that it is assumed that all patients must eventually experience one of the events (i.e. no cure fraction).
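To make the role of the constraint concrete, here is a minimal Python sketch of the Gompertz CIF and of a log-likelihood written directly in terms of the CIFs. It is an illustration, not the authors' code. The data layout (`(left, right, cause)` tuples, with `cause=None` marking right censoring at `left`) and the function names are assumptions of mine. An event of type k known to lie in an interval contributes $F_k(R) - F_k(L)$, and a right-censored subject contributes one minus the sum of the CIFs at the censoring time.

```python
import math

def gompertz_cif(t, alpha, beta):
    """Gompertz CIF: 1 - exp[beta * {1 - exp(alpha * t)} / alpha], with alpha != 0."""
    return 1.0 - math.exp(beta * (1.0 - math.exp(alpha * t)) / alpha)

def log_likelihood(observations, params):
    """Log-likelihood for interval-censored competing risks data.

    observations: list of (left, right, cause) tuples (hypothetical layout);
                  cause is None for a subject right censored at `left`.
    params:       dict mapping cause -> (alpha, beta).
    """
    total = 0.0
    for left, right, cause in observations:
        if cause is None:
            # Right censored: probability of no event of any type by `left`.
            contrib = 1.0 - sum(gompertz_cif(left, a, b) for a, b in params.values())
        else:
            # Event of type `cause` somewhere in the interval (left, right].
            a, b = params[cause]
            contrib = gompertz_cif(right, a, b) - gompertz_cif(left, a, b)
        if contrib <= 0.0:
            # CIFs sum to 1 or more at a censoring time: log is undefined.
            return -math.inf
        total += math.log(contrib)
    return total

if __name__ == "__main__":
    params = {1: (-0.5, 0.3), 2: (-0.2, 0.1)}  # illustrative values only
    data = [(1.0, 2.0, 1), (0.5, 3.0, 2), (4.0, None, None)]
    print(log_likelihood(data, params))
```

Returning `-inf` when a contribution is non-positive is one simple way to keep an unconstrained optimizer inside the region where the CIFs sum to less than 1 at the observed censoring times. It does nothing to enforce validity beyond those times.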

In addition to full likelihood estimation, the parametric analogue of the naive estimator investigated in Jewell et al. (2003, Biometrika) is also considered. This just involves modelling each CIF separately as if it were univariate interval-censored survival data. Obviously there is an even greater risk of the CIFs summing to more than 1 for the naive estimates.

The paper does not get as far as considering models for covariates. The main complication with adding covariates to the Gompertz model seems to be that we would then be almost guaranteed to have covariate patterns where the CIFs sum to more than 1, even at times of interest. There is surely an advantage in modelling the cause-specific hazards instead, as this way the CIFs are guaranteed to be valid at all times for all covariate values. While for most parametric hazards computation of the CIF requires numerical integration, the extra computation required shouldn't be prohibitive.
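As a rough illustration of the cause-specific hazard route, the sketch below computes each CIF as $F_k(t) = \int_0^t \lambda_k(s) \exp\{-\sum_j \Lambda_j(s)\}\, ds$ with a simple trapezoidal rule. The choice of Gompertz hazards, the function names and the step count are my own assumptions for the example. Nothing here comes from the paper.

```python
import math

def gompertz_hazard(t, alpha, beta):
    """Cause-specific hazard beta * exp(alpha * t)."""
    return beta * math.exp(alpha * t)

def gompertz_cum_hazard(t, alpha, beta):
    """Cumulative hazard beta * {exp(alpha * t) - 1} / alpha, with alpha != 0."""
    return beta * (math.exp(alpha * t) - 1.0) / alpha

def cif_from_hazards(t, params, cause, n_steps=2000):
    """Trapezoidal approximation of the CIF for `cause` at time t.

    params: dict mapping cause -> (alpha, beta) of its cause-specific hazard.
    """
    def integrand(s):
        # Overall survival uses the sum of all cause-specific cumulative hazards.
        overall_survival = math.exp(
            -sum(gompertz_cum_hazard(s, a, b) for a, b in params.values())
        )
        a, b = params[cause]
        return gompertz_hazard(s, a, b) * overall_survival

    h = t / n_steps
    total = 0.5 * (integrand(0.0) + integrand(t))
    for i in range(1, n_steps):
        total += integrand(i * h)
    return total * h

if __name__ == "__main__":
    params = {1: (0.1, 0.05), 2: (-0.3, 0.2)}  # illustrative values only
    cifs = {k: cif_from_hazards(5.0, params, k) for k in params}
    print(cifs, sum(cifs.values()))
```

Because the CIFs built this way sum to one minus the overall survival function, they cannot exceed 1 in total, whatever positive multiplier a covariate model applies to each hazard.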

Update: The original paper appears to have been superseded by a new version with a slightly different set of authors (Hudgens, Li & Fine).