## Friday, 23 November 2012

### Ties between event times and jump times in the Cox model

Xin, Horrocks and Darlington have a new paper in Statistics in Medicine. It considers approaches for dealing with ties in Cox proportional hazards models, not between event times but between event times and changes to time-dependent covariates.

If a change in a time-dependent covariate coincides with a failure time, there is ambiguity over which value of the time-dependent covariate, z(t+) or z(t-), should be taken for the risk set at time t. By convention, it is usually assumed that z(t-) should be taken, i.e. that the change in the covariate occurs after the failure time. The authors demonstrate that for small sample sizes and/or a large proportion of ties, the estimates can be sensitive to the convention chosen. The authors consider only cases where z(t) is a binary indicator that jumps from 0 to 1 at some point and cannot make the reverse jump. Obviously this will magnify the potential for bias, because the "change after" convention will always underestimate the true risk whereas the "change before" convention will always overestimate it.

The authors consider some simple adjustments for the problem: compute the "change before" and "change after" estimates and take their average or use random jittering. A problem with the averaging approach is estimating the standard error of the resulting estimator. An upper bound can be obtained by assuming the two estimators have perfect correlation. The jittering estimator obviously has the problem that different random jitters will give different results, though in principle the jittering could be repeated multiple times and combined in a fashion akin to multiple imputation.
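The upper bound for the averaged estimator follows from a short variance calculation. The notation here is introduced for illustration and is not taken from the paper: $\hat\beta_B$ and $\hat\beta_A$ are the "change before" and "change after" estimates, $s_B$ and $s_A$ their standard errors, and $\rho$ their (unknown) correlation.

$$\operatorname{Var}\left(\frac{\hat\beta_B + \hat\beta_A}{2}\right) = \frac{s_B^2 + s_A^2 + 2\rho\, s_B s_A}{4} \le \frac{(s_B + s_A)^2}{4}, \qquad \text{so} \qquad \operatorname{SE}\left(\frac{\hat\beta_B + \hat\beta_A}{2}\right) \le \frac{s_B + s_A}{2},$$

with equality when $\rho = 1$, i.e. under perfect correlation.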

It is surprising that the further option of adopting a method akin to the Efron method for ties is not considered. At each failure time there is an associated risk set. It could be argued that every tied covariate jump time has a 50% chance of occurring before the failure time and a 50% chance of occurring after it. The expected contribution from a particular risk set $\mathcal{R}$ could then be

$$\sum_{r \in \mathcal{R}} \left[ 0.5 \exp\{\beta z_r(t+)\} + 0.5 \exp\{\beta z_r(t-)\} \right].$$

It should also be possible to apply this approach using standard software, e.g. coxph() in R. It is simply necessary to replace any (start, stop) interval that ends with a tied "stop" with two intervals, (start, stop - 0.00001) and (start, stop + 0.00001), each of which is associated with a weight of 0.5.
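The interval-splitting idea can be sketched in a few lines of Python, purely as a data-preparation illustration rather than a tested recipe. Everything here is an assumption made for the sketch: the record layout (`id`, `start`, `stop`, `event`, `z`), the helper names `split_tied_jumps` and `risk_set_sum`, and the restriction to at most one tied jump per subject (as with a single 0 to 1 jump). The sketch also shifts the start of the following interval so that each weighted copy remains a contiguous history, which the description above leaves implicit.

```python
import math

def split_tied_jumps(rows, eps=1e-5):
    """Duplicate subjects whose covariate jump ties with a failure time.

    rows: list of dicts with keys id, start, stop, event, z (hypothetical layout),
    each describing a counting-process interval (start, stop].
    Returns new rows with an added 'weight' key.
    """
    event_times = {r["stop"] for r in rows if r["event"] == 1}
    by_id = {}
    for r in rows:
        by_id.setdefault(r["id"], []).append(r)

    out = []
    for recs in by_id.values():
        recs = sorted(recs, key=lambda r: r["start"])
        # Jump times are the stops of all but the last interval of a subject.
        tied = [r["stop"] for r in recs[:-1] if r["stop"] in event_times]
        if not tied:
            out.extend(dict(r, weight=1.0) for r in recs)
            continue
        if len(tied) > 1:
            raise ValueError("sketch assumes at most one tied jump per subject")
        t = tied[0]
        # One copy jumps just before t, the other just after; each has weight 0.5.
        for shift in (-eps, eps):
            for r in recs:
                new = dict(r, weight=0.5)
                if r["stop"] == t:
                    new["stop"] = t + shift
                if r["start"] == t:
                    new["start"] = t + shift
                out.append(new)
    return out

def risk_set_sum(rows, t, beta):
    """Weighted sum of exp(beta * z) over intervals at risk at time t."""
    return sum(r["weight"] * math.exp(beta * r["z"])
               for r in rows if r["start"] < t <= r["stop"])

# Toy data: subject 1 fails at t = 5; subject 2's covariate jumps at t = 5.
rows = [
    {"id": 1, "start": 0.0, "stop": 5.0, "event": 1, "z": 0},
    {"id": 2, "start": 0.0, "stop": 5.0, "event": 0, "z": 0},
    {"id": 2, "start": 5.0, "stop": 10.0, "event": 0, "z": 1},
]
expanded = split_tied_jumps(rows)
```

For the toy data, `risk_set_sum(expanded, 5.0, beta)` reproduces the expected contribution above: subject 1 contributes 1 and subject 2 contributes 0.5 + 0.5 exp(β). The expanded rows, with their weights, are what would then be handed to software that accepts case weights, such as the `weights` argument of coxph().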