Wednesday 23 February 2011

Estimating and testing for center effects in competing risks

Sandrine Katsahian and Christian Boudreau have a new paper in Statistics in Medicine. This develops methods for including frailty terms within a Fine-Gray competing risks model in order to account for clustering, e.g. effects of different centres.

Since the Fine-Gray model is essentially just a standard Cox proportional hazards regression model with additional time dependent weights, based on the censoring distribution, for individuals who have had a competing event, methods appropriate for standard Cox frailty models can be readily adapted.

Katsahian and Boudreau closely follow the approach taken by Ripatti and Palmgren (Biometrics, 2000). They assume a Gaussian frailty. Computation of the likelihood requires integrating out the frailty terms, which is done here using a Laplace approximation. A difficulty with the Laplace approximation is that it still requires the modal value of the frailty terms conditional on the data and the current values of the parameters. The authors therefore take a profile likelihood approach in which they fix the frailty variance and maximize the likelihood term with respect to both the regression parameters and the frailty terms. Having obtained these estimates, they can then plug them into the Laplace approximation to get the profile likelihood for the frailty variance. The procedure gives a local approximation that can be used to suggest the updated estimate of the frailty variance. Thus the process involves alternating between two Newton-Raphson algorithms until convergence.
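To make the Laplace step concrete, here is a minimal Python sketch on a toy problem, not the Fine-Gray partial likelihood used in the paper. It assumes a single cluster with a Poisson event count y, a known exposure, and a Gaussian log-frailty b with variance theta; all function names and numerical values are illustrative assumptions. The conditional mode of b is found by Newton-Raphson and the Laplace value is compared with brute-force quadrature.

```python
import math

def log_integrand(b, y, exposure, theta):
    """Poisson log-likelihood in b plus the Gaussian penalty (constants dropped)."""
    return y * b - exposure * math.exp(b) - b * b / (2.0 * theta)

def mode_newton(y, exposure, theta, tol=1e-10, max_iter=100):
    """Newton-Raphson for the mode of the log-frailty b given the data and theta."""
    b = 0.0
    for _ in range(max_iter):
        grad = y - exposure * math.exp(b) - b / theta
        hess = -exposure * math.exp(b) - 1.0 / theta  # always negative (concave)
        step = grad / hess
        b -= step
        if abs(step) < tol:
            break
    return b

def laplace_log_marginal(y, exposure, theta):
    """Laplace approximation to the log marginal likelihood with b integrated out."""
    b_hat = mode_newton(y, exposure, theta)
    neg_hess = exposure * math.exp(b_hat) + 1.0 / theta
    # integral of exp(h) ~ exp(h(b_hat)) * sqrt(2*pi / neg_hess); the Gaussian
    # density contributes -0.5*log(2*pi*theta), so the 2*pi terms cancel.
    return (log_integrand(b_hat, y, exposure, theta)
            - 0.5 * math.log(theta) - 0.5 * math.log(neg_hess))

def quadrature_log_marginal(y, exposure, theta, lo=-10.0, hi=10.0, n=20000):
    """Trapezoidal-rule reference value for the same log marginal likelihood."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(log_integrand(lo + i * step, y, exposure, theta))
    return math.log(total * step) - 0.5 * math.log(2.0 * math.pi * theta)
```

Evaluating laplace_log_marginal over a grid of theta values would trace out a profile-type likelihood for the frailty variance in this toy setting; in the paper the same idea is applied with the regression parameters re-maximized at each value.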

Monday 21 February 2011

Testing Markovianity in the Three-state Progressive Model via future-past Association

Mar Rodríguez-Girondo and Jacobo de Uña Álvarez have a paper currently available as a Universidade de Vigo Discussion paper in Statistics and Operations Research. They develop a test for the Markov property in a progressive three-state model subject to continuous observation up to right-censoring. The test is based on calculating Kendall's tau at each time point t, which involves calculating the difference between the concordance and discordance probabilities for two pairs (Z,T), where Z is the sojourn time in state 1 and T is the time of entry to state 3, given both subjects are in state 2 at time t, i.e. Z <= t < T. If the Markov property holds we expect Kendall's tau to stay at around zero. For a non-Markov process tau would vary with time away from 0. An estimator for tau is developed and a bootstrap resampling algorithm is proposed to estimate a p-value for tau at a fixed time point t, based on independently sampling Z and T from their empirical distributions. A trace of p-values at a grid of time points can then be produced.
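As a rough illustration of the quantity being tested, the following Python sketch computes a naive Kendall's tau among subjects in state 2 at time t. It assumes every (Z,T) pair is fully observed, so it ignores the right-censoring that the paper's estimator is designed to handle; the function name and input format are assumptions for illustration only.

```python
def conditional_kendall_tau(pairs, t):
    """Naive Kendall's tau between Z and T among subjects with Z <= t < T.

    pairs: list of (z, t_entry) tuples, assumed uncensored (an assumption of
    this sketch, not of the paper). Returns NaN if fewer than two subjects
    are in state 2 at time t.
    """
    in_state_2 = [(z, u) for z, u in pairs if z <= t < u]
    n = len(in_state_2)
    n_pairs = n * (n - 1) // 2
    if n_pairs == 0:
        return float("nan")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = ((in_state_2[i][0] - in_state_2[j][0])
                    * (in_state_2[i][1] - in_state_2[j][1]))
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    # Difference between the concordance and discordance proportions.
    return (concordant - discordant) / n_pairs
```

Calling this over a grid of t values gives the kind of trace described above, although without the censoring correction and without the bootstrap p-values.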

Unlike the Cox proportional hazards approach, where a single statistic is produced, here the p-value varies depending on the choice of t. Power superior to the Cox-PH approach was obtained in simulations, but only for a good choice of t. An omnibus statistic based on some weighted integral of the absolute value (or square) of tau over the observation range would be useful.

The paper only deals with the progressive 3-state case. It's unclear how the method would be extended to incorporate complicated past history in a more general multi-state model. However, there is scope to extend it to testing whether a particular state within a model has semi-Markov dependencies. In its current form, while the test is an interesting concept, using a simple Cox-PH seems a much more attractive prospect in practice. Update: This paper is now published in Biometrical Journal.

A Hidden Markov Model for Informative Dropout in Longitudinal Response Data with Crisis States

Spagnoli, Henderson, Boys and Houwing-Duistermaat have a new paper in Statistics and Probability Letters. The paper is concerned with the modelling of discrete-time continuous response longitudinal data in the presence of informative dropout. The process of dropout is modelled by a three-state (potentially non-homogeneous) discrete time Markov model. The three states correspond to stable, crisis and dropout. The crisis state is characterized by a higher probability of dropout and a shift in the mean response. The shift is random but is fixed for each individual so that repeated visits to the crisis state have a cumulative effect on the mean. The model is motivated by studies in which dropout is more likely to occur after the treatment has been ineffective for a period of time. A linear mixed model, with random slope and intercept, is taken for the responses, with the response at time m being shifted by a random quantity, d, multiplied by the number of time periods spent in the crisis state.
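A small simulation may help fix ideas. The Python sketch below generates one subject's data under a time-homogeneous version of the model. The transition probabilities, the Gaussian distribution for the shift d, and all parameter values are hypothetical choices made for illustration; they are not taken from the paper.

```python
import random

STABLE, CRISIS, DROPOUT = 0, 1, 2

# Hypothetical one-step transition probabilities (rows: current state).
# The crisis row has the higher dropout probability.
TRANSITIONS = {
    STABLE: [0.85, 0.10, 0.05],
    CRISIS: [0.30, 0.40, 0.30],
}

def simulate_subject(n_times, rng, beta0=10.0, beta1=-0.5,
                     sd_intercept=1.0, sd_slope=0.1,
                     mean_shift=1.5, sd_shift=0.5, sd_error=0.5):
    """Simulate hidden states and observed responses until dropout or n_times."""
    a = rng.gauss(0.0, sd_intercept)     # random intercept
    b = rng.gauss(0.0, sd_slope)         # random slope
    d = rng.gauss(mean_shift, sd_shift)  # subject-specific shift (assumed Gaussian)
    state, crisis_periods = STABLE, 0
    states, responses = [], []
    for m in range(n_times):
        if state == CRISIS:
            crisis_periods += 1
        # Mean response: linear mixed model plus d times periods spent in crisis.
        mean = beta0 + a + (beta1 + b) * m + d * crisis_periods
        states.append(state)
        responses.append(mean + rng.gauss(0.0, sd_error))
        state = rng.choices([STABLE, CRISIS, DROPOUT],
                            weights=TRANSITIONS[state])[0]
        if state == DROPOUT:
            break  # no further responses are observed after dropout
    return states, responses
```

For example, simulate_subject(10, random.Random(1)) returns the latent state path and the responses observed before dropout; in a real analysis only the responses and the dropout time would be seen.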

The authors put the model within the standard Rubin framework of MCAR/MAR/MNAR, making a distinction between observable and latent filtration. They are careful to formulate the model so that the latent mechanism for dropout only depends on the past and not the future.

The model is applied to schizophrenia data and the Leiden 85+ data. In the former case, interest is in the mean response curves in the hypothetical situation of no dropout. However for the Leiden 85+ data dropout is death and thus such curves would have little meaning. For both cases the crisis-state model represents a substantial improvement in likelihood compared to a two-state model.

Thursday 17 February 2011

Accelerated failure time regression for backward recurrence times and current durations

Keiding, Fine, Hansen and Slama have a new paper in Statistics & Probability Letters. This considers regression models for time to event data in which only cross-sectional data are available. In this situation the process is assumed to be a stationary renewal process. The observed times are then taken to be backward recurrence times (i.e. time since last renewal to a given observation time). Here it is noted that when the inter-arrival times, with density f(x), are subject to an accelerated failure time model, the same accelerated failure time model will apply to the backward recurrence times (a result apparently first found by Yamaguchi in the social science literature). As a consequence, an AFT model can be fitted to the observed backward recurrence times to give estimates of the AFT model for the inter-arrival times.
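The result can be checked numerically using the standard renewal theory fact that, under stationarity, the backward recurrence time has density S(y)/mu, where S is the survival function and mu the mean of the inter-arrival times. The Python sketch below assumes a Weibull baseline with unit scale and an acceleration factor accel (standing in for the exponentiated linear predictor); both choices and the function names are illustrative assumptions.

```python
import math

def weibull_survival(x, shape):
    """Baseline inter-arrival survival function S0(x) = exp(-x^shape)."""
    return math.exp(-(x ** shape))

def backward_density(y, shape, accel):
    """Backward recurrence density S(y)/mu when inter-arrival times are accel * X0."""
    mean = accel * math.gamma(1.0 + 1.0 / shape)  # mean of accel * X0
    return weibull_survival(y / accel, shape) / mean

def scaled_baseline_density(y, shape, accel):
    """Density of accel * Y0, where Y0 is the baseline backward recurrence time."""
    return backward_density(y / accel, shape, 1.0) / accel
```

The two functions agree for any y, which is the statement that scaling the inter-arrival times by accel scales the backward recurrence times by the same factor.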

Keiding et al consider modelling time-to-pregnancy using data on the current duration spent attempting to become pregnant by modelling the backward recurrence times as Pareto or generalised Gamma distributed within an accelerated failure time model with frequency of sexual intercourse as a covariate in the AFT model.

The equivalence of backward and forward recurrence times (the latter being the time to next event given observation from some fixed time) means that the same approach could be applied to prevalent cohort studies with an unknown initiation time.

Saturday 12 February 2011

Flexible Nonhomogeneous Markov models for panel observed data

Andrew Titman has a new paper in Biometrics. This develops an approach to fitting non-homogeneous Markov models to panel data by use of direct numerical solution of the Kolmogorov forward equations. Existing approaches to non-homogeneity have concentrated on special cases where the forward equations have matrix analytic solutions (i.e. piecewise constant intensities or time transformation models), although numerical solutions have been used in Bayesian analyses mainly to accommodate use in WinBUGS (see e.g. Welton and Ades or Pan and Chen). The approach is clearly somewhat more computationally intensive than matrix analytic methods. However, a couple of computational tricks are used to improve the situation. In particular a Fisher scoring algorithm is maintained by solving an extended system of ODEs incorporating the first derivatives of the transition probabilities with respect to the model parameters. The real point at which the method struggles is when there are continuous covariates because a separate ODE must be integrated for each covariate value in the data. For large datasets an exact approach becomes untenable. An approximate method is proposed in these situations where a clustering algorithm is used to reduce the number of unique covariate patterns and then each patient is assumed to have covariate pattern equal to the mean value within their cluster. This approach gives pretty close results to the exact method for 10 clusters and is even better for 50 or 100 clusters, where the method is often still practical.
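For readers unfamiliar with the forward equations, the following Python sketch solves dP/dt = P(t)Q(t) for a three-state illness-death model using a hand-written fourth-order Runge-Kutta scheme. This is not the paper's code (which is in R and uses deSolve); the generator function, with an onset intensity that increases linearly in time, and all numerical values are hypothetical.

```python
def generator(t):
    """Hypothetical illness-death generator Q(t); each row sums to zero."""
    q01 = 0.10 + 0.05 * t  # healthy -> ill, increasing in time
    q02 = 0.02             # healthy -> dead
    q12 = 0.15             # ill -> dead
    return [[-(q01 + q02), q01, q02],
            [0.0, -q12, q12],
            [0.0, 0.0, 0.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_scaled(p, k, c):
    """Return p + c * k elementwise."""
    n = len(p)
    return [[p[i][j] + c * k[i][j] for j in range(n)] for i in range(n)]

def transition_matrix(t0, t1, n_steps=200):
    """Integrate the forward equations from P(t0, t0) = I to obtain P(t0, t1)."""
    n = 3
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (t1 - t0) / n_steps
    for step in range(n_steps):
        t = t0 + step * h
        k1 = matmul(p, generator(t))
        k2 = matmul(add_scaled(p, k1, h / 2), generator(t + h / 2))
        k3 = matmul(add_scaled(p, k2, h / 2), generator(t + h / 2))
        k4 = matmul(add_scaled(p, k3, h), generator(t + h))
        for i in range(n):
            for j in range(n):
                p[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j]
                                    + 2 * k3[i][j] + k4[i][j])
    return p
```

With panel data, a subject seen in state r at time t0 and in state s at time t1 would contribute entry [r][s] of transition_matrix(t0, t1) to the likelihood, which is why a separate integration is needed for each distinct covariate pattern and observation interval.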

B-spline functions for the transition intensities based on a known set of knot points are proposed. This gives a model which can be viewed both as a generalization of the flexible time transformation approach of Hubbard et al, and also as a smooth alternative to piecewise constant intensities. One downside of the great flexibility is that it is quickly possible to run into models with identifiability problems and singular Fisher information matrices. Titman proposes to limit the spline to a maximum time and assume constant intensities beyond this range. Similarly, he suggests there often won't be enough data to allow inhomogeneity on all transition intensities and the method is thus most useful when one or two intensities are of most interest. For the CAV data analyzed, disease onset is of most importance and the method performs better at picking up the increasing hazard than a time transformation model that requires the inhomogeneity to be proportional between intensities.

As part of the supplementary materials some fairly general R code, working in conjunction with the R package deSolve for the ODE solver, is provided. This is flexible in allowing user-defined forms for the generator matrix, but there is no easy interface for someone to specify that they want a 5-state model with a certain set of allowable intensities and inhomogeneity in a particular set of states, in the same way as for piecewise constant intensities in msm, say.