Monday, 21 December 2009

Estimating life expectancy of demented and institutionalized subjects from interval-censored observations of a multi-state model

Joly, Durand, Helmer and Commenges have a new paper in Statistical Modelling. This concerns the estimation of life expectancy in patients with dementia. Unlike the similar analysis by Van den Hout and Matthews, here there are 5 states rather than 3 since they consider institutionalization as an additional event. Age at death is known up to right-censoring but other transitions are interval censored. Like previous papers by Joly and Commenges, a penalized likelihood approach is taken. The penalized likelihood estimate is approximated using cubic M-splines, with the degree of penalization chosen via an approximate cross-validation score. This allows smooth, flexible non-homogeneous intensities. A Markov assumption is made but semi-Markov models can also be fitted provided the model is progressive.

Life expectancies are found by integrating the estimated transition probabilities. Like Van den Hout and Matthews, a parametric bootstrap approach based on simulating from the asymptotic normal distribution of the parameters is used to obtain confidence bands. However, these will typically underestimate variability because the penalization factor is taken to be fixed.
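The two steps above (integrate the transition probabilities, then simulate from the asymptotic normal distribution of the parameters) can be sketched in a few lines. This is only an illustration under strong simplifying assumptions that are not in the paper: a three-state illness-death model without recovery, constant intensities q12, q13 and q23, and independent normal sampling distributions for the log-intensities. All numerical values are hypothetical.

```python
import math
import random


def transition_probs(t, q12, q13, q23):
    """P11(t) and P12(t) for a time-homogeneous illness-death model without recovery."""
    a = q12 + q13  # total exit rate from the healthy state
    p11 = math.exp(-a * t)
    if abs(a - q23) < 1e-12:  # limiting case when the two exit rates coincide
        p12 = q12 * t * math.exp(-a * t)
    else:
        p12 = q12 / (a - q23) * (math.exp(-q23 * t) - math.exp(-a * t))
    return p11, p12


def life_expectancy(q12, q13, q23, horizon=200.0, step=0.05):
    """Total life expectancy from the healthy state: trapezoidal integral of
    P11(t) + P12(t) over [0, horizon]."""
    n = int(round(horizon / step))
    total = 0.0
    for k in range(n + 1):
        p11, p12 = transition_probs(k * step, q12, q13, q23)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * (p11 + p12)
    return total * step


def simulation_interval(log_means, log_ses, draws=200, seed=1):
    """Percentile interval from draws of the log-intensities, taken here to be
    independent normals (a simplification of the full asymptotic covariance)."""
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        q12, q13, q23 = (math.exp(rng.gauss(m, s)) for m, s in zip(log_means, log_ses))
        values.append(life_expectancy(q12, q13, q23))
    values.sort()
    return values[int(0.025 * draws)], values[int(0.975 * draws) - 1]


# Hypothetical point estimates and standard errors on the log scale.
estimates = [math.log(0.10), math.log(0.05), math.log(0.20)]
std_errors = [0.10, 0.10, 0.10]
point = life_expectancy(0.10, 0.05, 0.20)
lower, upper = simulation_interval(estimates, std_errors)
```

As in the paper, nothing in this procedure reflects uncertainty about a smoothing parameter, so in a penalized setting the resulting band would be too narrow for the reason given above.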

Tuesday, 15 December 2009

Patient death as a censoring event or competing risk event in models of nursing home placement

Szychowski et al have a new paper in Statistics in Medicine. This looks at competing risks data where the event of interest is placement in a nursing home, with death as the competing event. They compare the classical cause-specific hazards (CSH) regression approach with that of the Fine-Gray proportional subdistribution hazards model. The data were from an RCT on the effectiveness of an enhanced counseling and support intervention. The estimate of the effect of the intervention was similar in both cases (some evidence of a benefit in delaying admission to a nursing home). The authors attribute the similarity to a lack of a significant effect of the intervention on the CSH of death. They recommend that both CSH and proportional subdistribution hazards approaches to covariate effect modelling should be considered.
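The distinction between treating death as censoring and treating it as a competing event can be seen without any regression model. The sketch below is not the authors' analysis: it compares the non-parametric cumulative incidence estimate (Aalen-Johansen) of nursing home placement with one minus the Kaplan-Meier estimate obtained when deaths are treated as censored. The toy data and the coding (0 = censored, 1 = placement, 2 = death) are made up for illustration.

```python
def cumulative_incidence(times, causes, cause_of_interest, t):
    """Aalen-Johansen estimate of P(event by t and event is of the given cause)."""
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0  # probability of no event of any cause just before the current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        current = data[i][0]
        tied = [c for (s, c) in data if s == current]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        cif += surv * d_interest / n_at_risk
        surv *= 1.0 - d_any / n_at_risk
        n_at_risk -= len(tied)
        i += len(tied)
    return cif


def one_minus_km(times, causes, cause_of_interest, t):
    """1 - Kaplan-Meier, treating every other cause as censoring."""
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        current = data[i][0]
        tied = [c for (s, c) in data if s == current]
        d = sum(1 for c in tied if c == cause_of_interest)
        surv *= 1.0 - d / n_at_risk
        n_at_risk -= len(tied)
        i += len(tied)
    return 1.0 - surv


# Hypothetical follow-up times and event types.
times = [1, 2, 3, 4, 5, 6]
causes = [2, 1, 2, 1, 0, 1]
cif_home = cumulative_incidence(times, causes, 1, 6)
cif_death = cumulative_incidence(times, causes, 2, 6)
naive_home = one_minus_km(times, causes, 1, 6)
```

On these data the naive quantity is larger than the cumulative incidence, because it describes a hypothetical world in which nobody dies before placement.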

Tuesday, 24 November 2009

Statistical Analysis of Illness-Death Processes and Semicompeting Risks Data

Xu, Kalbfleisch and Tai have a new paper in Biometrics. They make a compelling case against a latent failure times approach to the analysis of semi-competing risks data, advocating instead a classical cause-specific hazards approach. Moreover, they note that semi-competing risks is essentially just an illness-death model. They consider models where the intensities have a shared Gamma frailty and present methods for (non-parametric) maximum likelihood estimation. In addition, covariates can be included acting proportionally on the conditional hazards (an alternative approach with proportionality on the marginal hazards is also outlined).

Monday, 23 November 2009

Semi-Markov models with phase-type sojourn distributions

Titman and Sharples have a new paper in Biometrics. This concerns the fitting of semi-Markov models to panel observed data. They propose to fit models where the sojourn time in each state has a phase-type sojourn distribution - i.e. corresponds to the time to absorption of some time homogeneous Markov model. The advantage of this specification is that, unlike general semi-Markov models, the likelihood remains analytically tractable, falling within a hidden Markov model framework. This also makes the extension to models where the observations are subject to misclassification error straightforward, at least theoretically. A two-phase Coxian phase-type distribution is proposed for the sojourn time, allowing increasing, decreasing or constant hazards with respect to time since entry into the state.

While the phase-type framework makes computation of the likelihood more straightforward, model fitting is still potentially problematic due to possible problems of parameter estimability. Also, certain parameters of the phase-type model are unidentifiable under a Markov model, meaning an (approximate) modified likelihood ratio test is required to test the Markov assumption.
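To see why a two-phase Coxian sojourn distribution can give increasing, decreasing or constant hazards, it helps to write the hazard down explicitly. The following is a minimal sketch; the parameter names (lam1 for the rate of moving from phase 1 to phase 2, mu1 and mu2 for the rates of leaving the state from each phase) are my own labels rather than the notation of the paper.

```python
import math


def coxian_hazard(t, lam1, mu1, mu2):
    """Hazard of leaving the state at time t since entry.

    Phase 1: leave the state at rate mu1, or move to phase 2 at rate lam1.
    Phase 2: leave the state at rate mu2.
    """
    a = lam1 + mu1  # total exit rate from phase 1
    p1 = math.exp(-a * t)  # P(still in phase 1 at time t)
    if abs(a - mu2) < 1e-12:  # limiting case of equal rates
        p2 = lam1 * t * math.exp(-a * t)
    else:
        p2 = lam1 / (a - mu2) * (math.exp(-mu2 * t) - math.exp(-a * t))
    # density divided by survival
    return (mu1 * p1 + mu2 * p2) / (p1 + p2)


grid = [0.0, 1.0, 2.0, 5.0]
increasing = [coxian_hazard(t, 1.0, 0.5, 2.0) for t in grid]  # mu2 > mu1
decreasing = [coxian_hazard(t, 1.0, 0.5, 0.1) for t in grid]  # mu2 < mu1
constant = [coxian_hazard(t, 1.0, 0.5, 0.5) for t in grid]    # mu2 == mu1
```

The hazard starts at mu1 and moves monotonically towards a limit as the chance of being in phase 2 grows. When mu1 equals mu2 the hazard is flat whatever the value of lam1, which is the identifiability problem under the Markov model mentioned above.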

Wednesday, 11 November 2009

Mstate: Data preparation, estimation and prediction in multi-state models. R package.

A significant barrier to the widespread use of multi-state models in applied statistics has been the lack of software. For right-censored data, models on the transition intensities can be fitted straightforwardly using standard survival modelling techniques (e.g. Cox regression and Nelson-Aalen estimators). However, for estimates of cumulative incidence functions, state occupation probabilities and moreover their standard errors, with a few exceptions it was generally necessary to write your own code. Hein Putter, Marta Fiocco and Liesbeth de Wreede have created the R package mstate, which provides a general framework for fitting right-censored and left-truncated non-parametric and semi-parametric multi-state models. The package exploits the existing R package survival to fit the models to intensities but also provides routines to calculate transition probabilities and their standard errors for the overall multi-state model. This is clearly a very useful tool. One small drawback of the package is that the routines such as those to calculate the transition probabilities appear to be coded entirely in R. As a result computation is not as fast as might be hoped. The package etm by Arthur Allignol, which only computes the Aalen-Johansen estimator, may be preferable in terms of speed when only a non-parametric model is required as this incorporates some C code.

Update: An article on mstate in Computer Methods and Programs in Biomedicine is now available.

Further Update: A further paper on mstate is now available in the Journal of Statistical Software.

Computation of the asymptotic null distribution of goodness-of-fit tests for multi-state models

Andrew Titman has a new paper in Lifetime Data Analysis. This is essentially a continuation of previous papers by Aguirre-Hernandez and Farewell and by Titman and Sharples on Pearson-type goodness-of-fit tests for Markov and hidden Markov models on panel observed data. A practical problem with the tests is that the null distribution depends on the true parameter value and the observation scheme, and that a chi-squared approximation can perform inadequately. A parametric bootstrap could be used to find the upper 95% point of the distribution. However, for many models the re-fitting required may take an unacceptable amount of time. Titman shows that, conditional on a fixed observation scheme, the asymptotic distribution can be expressed as a weighted sum of independent random variables, where the weights depend on the true parameter values. A simulation study shows that computing the weights based on the maximum likelihood estimate of the parameter values gives tests of close to the appropriate size for realistic sample sizes. The method can be applied to both Markov and misclassification-type hidden Markov models, but only when all transitions are interval-censored.
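Once the weights are available, the upper 95% point of a weighted sum is cheap to obtain by simulation since no model re-fitting is involved. The sketch below assumes the independent random variables are chi-squared with one degree of freedom, which is my assumption and is not stated in the summary above. The weights passed in are hypothetical, and the computation of the weights from the fitted model is not shown.

```python
import random


def upper_point(weights, level=0.95, draws=20000, seed=1):
    """Monte Carlo quantile of sum_j w_j * Z_j^2 with Z_j independent standard normal."""
    rng = random.Random(seed)
    sims = sorted(
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights) for _ in range(draws)
    )
    return sims[int(level * draws)]


# Sanity checks against known chi-squared quantiles (equal weights of 1).
q_one = upper_point([1.0])        # close to the chi-squared(1) value of about 3.84
q_two = upper_point([1.0, 1.0])   # close to the chi-squared(2) value of about 5.99

# Hypothetical unequal weights, as might arise from a fitted model.
q_weighted = upper_point([1.0, 0.6, 0.2])
```

The observed statistic would then be compared with the simulated quantile in place of a chi-squared critical value.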

Thursday, 5 November 2009

Analyzing longitudinal data with patients in different disease states during follow-up and death as final state

Le Cessie, de Vries, Buijs and Post have a new paper in Statistics in Medicine. This is concerned with estimating mean quality of life in breast cancer patients at different time points. The standard approach to analyzing such longitudinal data would be generalized estimating equations (GEE). However, observations are often missing, and assuming such data are missing completely at random (MCAR), or even missing at random (MAR), is unrealistic. In the current study a three-state progressive illness-death model is considered where the illness state refers to presence of a relapse. Both Markov (or clock-forward) and semi-Markov (or clock-reset) models are considered. There was continuous observation of the illness-death process, whereas the quality of life was observed at a common set of time points. The authors propose to model quality of life scores conditional on the state occupied in the multi-state model. A more realistic missingness model can then be adopted by assuming MAR conditional on the occupied state. Inverse probability weighting is used to deal with the missing data. Standard error estimation is performed by bootstrapping.

While the model gives an improved picture compared to ignoring the disease state, the model still makes the assumption that quality of life is dependent on time and current disease state but not on the time since entry into the current disease state.

Monday, 5 October 2009

Nonparametric inference for competing risks current status data with continuous, discrete or grouped observation times.

Marloes Maathuis and Michael Hudgens have a paper available at arXiv.org. This concerns nonparametric inference for competing risks current status data. They consider a naive estimator, which estimates each cumulative incidence function (CIF) independently using the pooled-adjacent-violators algorithm. The NPMLE is also considered. The naive estimator, while consistent, does not guarantee that the sum of the CIFs is at most 1. Moreover, previous work by Groeneboom et al suggests it is less efficient. However, the naive estimator has the advantage (in addition to being computationally simpler) that the limiting distribution of the estimates of each CIF is known (being the same as for standard current status survival data) and results regarding the likelihood ratio statistic (Banerjee and Wellner 2005) can be applied to get confidence intervals for the CIFs. These results are in the case of a smooth observation distribution. The authors note that if subjects can only be observed on a (finite) pre-defined grid of time points then obviously the CIFs can only be estimated at these time points but also, since the number of parameters cannot increase indefinitely, standard n^(1/2) asymptotics apply.
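The naive estimator is easy to write down: for each cause, order the subjects by observation time and run an isotonic regression of the indicator "failure from this cause has occurred by the observation time". A minimal sketch follows, with tied observation times pooled before the pooled-adjacent-violators step. The function names and the toy data are mine and purely illustrative.

```python
def pava(values, weights):
    """Weighted non-decreasing least-squares fit by pooling adjacent violators."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # merge backwards while the fit would otherwise decrease
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def naive_cif(obs_times, indicators):
    """Naive CIF estimate at each distinct observation time for one cause."""
    grouped = {}
    for c, d in zip(obs_times, indicators):
        total, count = grouped.get(c, (0, 0))
        grouped[c] = (total + d, count + 1)
    times = sorted(grouped)
    means = [grouped[c][0] / grouped[c][1] for c in times]
    counts = [grouped[c][1] for c in times]
    return times, pava(means, counts)


# Hypothetical current status data: one observation time per subject and,
# for each cause, whether failure from that cause had occurred by then.
obs_times = [1, 2, 3, 4, 5, 6]
cause1 = [0, 1, 0, 0, 1, 1]
cause2 = [0, 0, 1, 0, 0, 0]
grid, cif1 = naive_cif(obs_times, cause1)
_, cif2 = naive_cif(obs_times, cause2)
total_cif = [a + b for a, b in zip(cif1, cif2)]
```

On these data the two fitted CIFs sum to 1.25 at the later observation times, illustrating the lack of a constraint on the sum noted above.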

Related to this work is the R package MLEcens developed by Marloes Maathuis. This computes the NPMLE for bivariate interval censored data. Special cases include competing risks data and standard survival data. Moreover the implementation seems to run considerably faster than the package Icens.

*Update: A video of Marloes Maathuis demonstrating MLEcens is available here.

**Update: The paper is now published in Biometrika.

Robust Estimation of State Occupancy Probabilities for Interval-Censored Multistate Data: An Application Involving Spondylitis in Psoriatic Arthritis

Tolusso and Cook have a paper in Communications in Statistics - Theory and Methods, based on chapter 4 of the first author's PhD thesis. They propose a method of robust estimation for the state occupancy probabilities in progressive multi-state models when the transition times are interval censored. The method is in the spirit of the Pepe estimator being based on the differences between the marginal survival distributions of state entry or exit times. These marginal survival distributions can be estimated either through the NPMLE using self-consistency algorithms, through weakly parametric piecewise-constant hazard assumptions, or via local-likelihood.
The method is applied to a three-state illness death model, where the absorbing state is death and times of entry into the absorbing state are known exactly. Prevalence in state 1 is estimated by the interval-censored survival estimate of exit from state 1, prevalence in state 3 is estimated through the Kaplan-Meier estimate of overall survival and prevalence in state 2 is based on the difference between these functions.
In this case the method can be thought of as a less computationally intensive alternative to using Frydman and Szarek's NPMLE, with the added advantage that it is not necessary to make the Markov assumption.

An unrecognized problem with the method in the case of exactly known death times is that, for the healthy state survival function, the upper boundary of the censoring interval is not independent of the process. If a patient dies then they will be censored in some interval (L, T], where T is the time of death. However, if they died from state 1, their exit time from state 1 was exactly T. Thus the sojourn time in state 1 will tend to be underestimated and consequently state 2 occupation will be overestimated. The extent of bias will depend on the chance of death from state 1 and the severity of interval censoring.

Tuesday, 22 September 2009

Estimating stroke-free and total life expectancy in the presence of non-ignorable missing values

Van den Hout and Matthews have a new paper in JRSS A. This deals with interval censored data from a three-state disease model where subjects may miss scheduled interviews meaning the disease status is not observed. A joint model for the disease state and an observation indicator is developed, being a continuous time generalisation of Cole et al (2005) that allows information from exact death times to be included. Conditional on the disease state and measured covariates, the observation indicator is governed by a logistic model.
In general the method seems promising for dealing with informative observation when the potential observation times are known.

Though not noted, the model can be expressed as a hidden Markov model. The authors state that the logistic model and the three-state Markov model are estimated separately. It is not made clear how this is achieved since the logistic model depends on the unobserved states of the Markov model. In some cases the missing state will in fact be known, for instance if the sequence is 1,-,1 or 2,-,2. However, for sequences like 1,-,2 or 1,-,3 it is not possible to establish the unobserved state.

The main aim of the analysis is to obtain estimates of life expectancy, disease-free life expectancy and post-disease life expectancy. These are complicated functions of the parameter vector as they involve integrals of transition probabilities. In addition to the method of Aalen et al 1997, Van den Hout and Matthews propose to use a Metropolis algorithm to get confidence intervals for the life expectancies. The resulting intervals have a Bayesian interpretation, being the credible intervals from an improper uniform prior, but will not be invariant to changes in parametrisation. In practice, the intervals may give good frequentist coverage, particularly for large samples. However, Van den Hout and Matthews seem to be implying the intervals have exact coverage (apart from Monte-Carlo error through the Metropolis algorithm), which is a substantial misconception. Moreover, no mention is made of the procedure being Bayesian.

Thursday, 20 August 2009

Joint Modeling of Self-Rated Health and Changes in Physical Functioning

Hubbard, Inoue and Diehr have a new paper in press in JASA. This applies the time-transformation model proposed by Hubbard et al (Biometrics, 2008). The data are panel observed with an assessment of disability at each observation plus a self-rated measure of health. The disability measure is assumed to follow a 5 state time non-homogeneous Markov model, with 4 levels of disability and death as an absorbing state. Backward transitions between disability levels are permitted. Two parametric forms for the time transformation were considered: a power transformation implying monotonicity of all intensities with time, and a two parameter transformation implying the intensities are all unimodal. The non-parametric transformation proposed in Hubbard et al (2008) is not considered here.

Disability is jointly modelled with the self-rated measure of health which is dichotomised as healthy or unhealthy. This health outcome may depend on both the current and past values of disability and other covariates. There would be obvious problems of missing data if the past history of disability is included due to the panel observation. The authors only consider models where the health outcome depends on current (+ predicted future) levels of disability but not past levels. Linear logistic models are used to relate the health outcome to the observed levels of disability and other covariates. Rudimentary goodness-of-fit for the multi-state model is carried out using the prevalence-counts method of Gentleman et al (Stats in Med, 1994), while the logistic model is assessed using the Hosmer-Lemeshow test.

Tuesday, 4 August 2009

Model diagnostics for multi-state models

Titman and Sharples have a new review paper in SMMR. This considers methods for assessing fit in parametric, panel observed multi-state models. The primary focus is on the assessment of time homogeneous Markov models, although there is also a section on hidden Markov models that occur if states are considered to be observed with classification error. Methods for fitting more complicated models such as non-homogeneous and random effects models are also reviewed. A simple graphical generalization of the prevalence counts method of Gentleman et al (Stats in Med, 1994) is also developed.

Monday, 3 August 2009

Estimating dementia-free life expectancy for Parkinson's patients using Bayesian inference and microsimulation

Van den Hout and Matthews have a new paper in Biostatistics involving a random-effects Markov model with time (age) dependent intensities. The methodology is close to that used by Pan et al and Wu et al, using a WinBUGS/OpenBUGS Bayesian approach. They use a three-state illness-death model without recovery. A more sophisticated multivariate log-normal random effect on the effects of age on the intensities is used, with a Wishart prior on the covariance matrix, which is more appropriate than the Gamma(e,e) type priors used by Pan et al. As in their recent Applied Statistics paper, time dependencies in the intensities are accounted for by assuming an individual that is observed at times t1 and t2 has a constant matrix of intensities between those points, but different assumptions are used to calculate life expectancies. The main methodological development is obtaining life expectancy estimates through 'microsimulation.' This is deemed necessary because there are two levels of variation: variation in the posterior of the parameters and variation from the random effects distribution conditional on the parameters. 'Microsimulation' (or simulation) just approximates the integral over the random effects distribution.
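The point that microsimulation is just Monte Carlo integration over the random effects can be checked in a toy case where the integral has a closed form. The sketch below is a deliberately stripped-down assumption of mine, not the model in the paper: a single alive-to-dead transition with constant intensity base_rate * exp(b), where b is a normal random effect with standard deviation sigma, and fixed parameter values (so posterior variation is ignored).

```python
import math
import random


def microsimulate_life_expectancy(base_rate, sigma, individuals=100000, seed=1):
    """Average simulated lifetime over individuals with their own random effect."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(individuals):
        b = rng.gauss(0.0, sigma)        # individual random effect
        rate = base_rate * math.exp(b)   # individual-specific death intensity
        total += rng.expovariate(rate)   # simulated residual lifetime
    return total / individuals


# Hypothetical parameter values.
simulated = microsimulate_life_expectancy(0.1, 0.5)

# Closed form for this toy case: E[1 / (base_rate * exp(b))]
# = exp(sigma^2 / 2) / base_rate.
closed_form = math.exp(0.5 ** 2 / 2.0) / 0.1
```

In a full Bayesian analysis this inner simulation would be repeated for each posterior draw of the parameters, giving the two levels of variation described above.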

Tuesday, 28 July 2009

Nonparametric inference and uniqueness for periodically observed progressive disease models

Beth Griffin and Stephen Lagakos have a new paper in Lifetime Data Analysis. They consider panel observed progressive disease model (chain-of-events) data. The NPMLE under a discrete-time semi-Markov assumption was developed by Sternberg and Satten (Biometrics, 1999). For datasets where individuals are observed at different times, some discretization of the data is required. An issue with the NPMLE is that it is not guaranteed to be unique and therefore reporting a single NPMLE may be misleading. The paper develops procedures for determining which components of the NPMLE are unique based on considering various re-parameterizations of the likelihood. The method is demonstrated on three example datasets including one on bronchiolitis obliterans syndrome in post-lung transplantation patients and one on primary HIV infection. The authors also provide a more intuitive algorithm for obtaining the NPMLE than the self-consistency algorithm of Sternberg and Satten.

Wednesday, 22 July 2009

On Induced Dependent Censoring for Quality Adjusted Lifetime (QAL) Data in Simple Illness-Death Model

A new paper by Pradhan and Dewanji in Statistics and Probability Letters considers the problem of induced dependent censoring in quality adjusted lifetime data. Quality adjusted survival time and quality adjusted censoring times are correlated even if the raw survival and censoring times are independent. Kaplan-Meier based estimates of QAL using the QA survival and censoring times will therefore be biased. The paper investigates the nature of the correlation and bias for the case of a simple three-state illness-death model. Under a semi-Markov assumption, they show that QA survival and censoring are positively correlated when the healthy state has greater utility than illness, but the correlation is negative if the relative utilities are reversed.

Tuesday, 23 June 2009

About Earthquake Forecasting by Markov Renewal Processes

There is a new paper in Methodology and Computing in Applied Probability by Garavaglia and Pavani. This concerns the forecasting of earthquakes. They analyse data on severe earthquakes in Turkey during the 20th century. Two approaches were considered. Firstly a two-state semi-Markov model is proposed to model the process. The states represent the occurrence of earthquakes of magnitude 5.5 - 6.3 and greater than 6.3 respectively. However, the probability that the next earthquake is of magnitude greater than 6.3 is seen to not depend on the magnitude of the last earthquake. Hence, instead a model based only on the inter-event times is considered using an exponential-Weibull mixture distribution.

Friday, 5 June 2009

Conferences

Below is a list of multi-state modelling related talks at forthcoming conferences:

Joint Statistical Meetings 2009:
Somnath Datta and Ling Lan
Nonparametric Inference in Multistate Models with Interval-Censored Data

Richard Cook
Multistate Analysis of Bivariate Interval-Censored Failure Time Data

Hans C. van Houwelingen and Hein Putter
Dynamic Predicting by Landmarking as an Alternative for Multistate Modeling: An Application to Acute Lymphoid Leukemia Data

Liou Xu, David Snowdon and Richard J. Kryscio
A Markov Transition Model to Dementia with Death as a Competing Event

Wei-Ting Hwang, Neha Vapiwala and Lawrence J. Solin
A Stayer-Mover Mixture Markov Model for Disease Transitions in Early-Staged Breast Cancer Treated with Breast-Conserving Therapy (BCT)

Halina Frydman and Michael Szarek
Estimation of Overall Survival in an Illness-Death Model with Application to the Vertical Transmission of HIV-1

ISCB 30:

Talks:

Michael Lauseker, Jörg Hasford and Andreas Hochhaus
Prediction In Multi-State Models And Its Application In Chronic Myeloid Leukaemia     

Martin Wolkewitz, Arthur Allignol, Martin Schumacher and Jan Beyersmann
Understanding And Avoiding Survival Bias: An Application Of Multistate Models In A Cohort Of Oscar Nominees

Thomas Kneib
Semiparametric Multi-State Models

Giuliana Cortese and Per Kragh Andersen
Internal Time-Dependent Covariates In Competing Risks Models For Bone Marrow Transplant Studies 

Per Kragh Andersen, Kajsa Kvist and Lars Kessing
Effect Of Event-Dependent Sampling Of Recurrent Events.     

Michael Schemper and Alexandra Kaider
Quantifying The Correlation Of Bivariate Survival Times By Means Of A Novel Self-Consistency Approach     

Ronald Geskus, Nicolas Poulin, Hilton Whittle and Maarten Schim van der Loeff
A Markov Cure Model To Compare Progression Of HIV-1 And HIV-2 Infection

Posters:
Qing Wang, Linda Sharples and Nikolaos Demiris
Multi-State Models For The Analysis Of Lung Transplant Data     

Liesbeth de Wreede, Marta Fiocco and Hein Putter
The Analysis Of Multi-State Models By Means Of The Mstate Package     

ISI, Durban:
Invited Paper meeting:
Inference and Prediction in Competing Risks and Multi-State Models
Organiser: Hein Putter
Participants: Martin Schumacher, Bendix Carstensen, Ørnulf Borgan.

Monday, 20 April 2009

Parameter estimation in a model for misclassified Markov data - a Bayesian approach.

Rosychuk and Islam have a paper in Computational Statistics and Data Analysis. This concerns parameter estimation in a two-state recurrent misclassification type hidden Markov model, where the Markov process is assumed to be continuous time and in equilibrium and is observed at discrete, equally spaced time points. A Bayesian approach to estimation is considered via Gibbs sampling. To avoid identifiability issues, the misclassification probabilities are constrained to be below 0.5. An additional issue is the choice of starting values of the transition probabilities for the latent Markov process. Values based on simple correction formulae previously developed by Rosychuk and Thompson appear to perform better than values based on taking naive estimates of the transition probabilities of the observed process.
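For readers unfamiliar with this model class, the likelihood that any estimation method (Bayesian or otherwise) works with can be computed by the forward algorithm. The sketch below is not the Gibbs sampler of the paper. It assumes states coded 0 and 1, intensities a (0 to 1) and b (1 to 0), a common spacing delta between observations, and misclassification probabilities e0 and e1. All of these names are mine.

```python
import math
from itertools import product


def transition_matrix(a, b, delta):
    """Transition probabilities over a gap delta for a two-state chain with
    intensities a (state 0 -> 1) and b (state 1 -> 0)."""
    total = a + b
    decay = math.exp(-total * delta)
    p01 = a / total * (1.0 - decay)
    p10 = b / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def hmm_likelihood(obs, a, b, delta, e0, e1):
    """Forward-algorithm likelihood of a sequence of observed states.

    e0 = P(observe 1 | true state 0), e1 = P(observe 0 | true state 1).
    """
    P = transition_matrix(a, b, delta)
    emit = [[1.0 - e0, e0], [e1, 1.0 - e1]]  # emit[true][observed]
    start = [b / (a + b), a / (a + b)]       # equilibrium distribution
    alpha = [start[s] * emit[s][obs[0]] for s in (0, 1)]
    for y in obs[1:]:
        alpha = [
            sum(alpha[r] * P[r][s] for r in (0, 1)) * emit[s][y] for s in (0, 1)
        ]
    return sum(alpha)


# Hypothetical parameter values.
lik = hmm_likelihood([0, 0, 1, 0], a=0.3, b=0.6, delta=1.0, e0=0.05, e1=0.10)
total_prob = sum(
    hmm_likelihood(list(seq), 0.3, 0.6, 1.0, 0.05, 0.10)
    for seq in product((0, 1), repeat=3)
)
```

Relabelling the two states while replacing e0 and e1 by their complements leaves this likelihood unchanged, which is why a constraint such as misclassification probabilities below 0.5 is needed.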

Monday, 6 April 2009

Competing risks and time-dependent covariates

Cortese and Andersen have a paper available as a research report from the Department of Biostatistics, University of Copenhagen. This concerns the problem of prediction in competing risks models where there are internal time-dependent covariates, meaning the trajectory of the covariate is not predictable, nor is it independent of the development of the disease/mortality process. The authors focus on the case where the time-dependent covariate is binary, and once it has value 1, cannot revert to value 0. They investigate three approaches. The first expands the state space of the competing risks model, having two alive states: 'alive and cov=0' and 'alive and cov=1' and applies standard methods based on Nelson-Aalen and Aalen-Johansen estimators. The two other methods are based on landmarking.

Update: This paper is now published in Biometrical Journal.

Tuesday, 31 March 2009

Robust Estimation of Mean Functions and Treatment Effects for Recurrent Events Under Event-Dependent Censoring and Termination

Richard Cook et al have a new paper in JASA. This concerns the estimation of mean functions for recurrent events under event dependent censoring. They consider several methods, including a multi-state approach using an estimate based on the Aalen-Johansen estimator of the transition intensities using IPCW to correct for dependent censoring.

Tuesday, 24 March 2009

Estimating life expectancy in health and ill health by using a hidden Markov model

Van den Hout, Jagger and Matthews have a paper to appear in JRSS C. The paper applies the misclassification hidden Markov model, developed by Satten and Longini and Jackson and Sharples, to modelling of data on cognitive impairment in the elderly and its effect on mortality. Patients with a cognition score (MMSE) below 22 were considered impaired. However, cognitive decline is considered to be progressive so backwards transitions in the dataset are explained through misclassification.

The main aim of the paper is to estimate life expectancies in the non-impaired and impaired states. As mortality will be highly dependent on age, non-homogeneous transition intensities are required. Rather than employ the standard approach of piecewise constant intensities, the authors instead include age as a log-linear time dependent covariate and assume that an individual observed at ages t and u, for t < u, has constant intensity Q(t) for the interval (t,u). This will clearly result in some degree of bias, particularly if observation times are widely spaced. Life expectancy is then calculated by assuming intensities are constant in 1 year intervals. As this is different from how the model was fitted, the bias may be further compounded.
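The size of the bias from holding the intensity at its value at the start of an interval can be gauged in a simple special case. The sketch below assumes a single alive-to-dead transition with log-linear intensity exp(alpha + beta * age), which is a simplification of mine rather than the three-state model in the paper, and the coefficient values are hypothetical.

```python
import math


def exact_survival(t, u, alpha, beta):
    """P(alive at age u | alive at age t) when the death intensity is
    exp(alpha + beta * age). Requires beta != 0."""
    cumulative = math.exp(alpha) / beta * (math.exp(beta * u) - math.exp(beta * t))
    return math.exp(-cumulative)


def frozen_survival(t, u, alpha, beta):
    """The same probability when the intensity is held at its age-t value on (t, u)."""
    return math.exp(-math.exp(alpha + beta * t) * (u - t))


alpha, beta = -9.0, 0.09  # hypothetical log-linear age effect
gap_1yr = frozen_survival(75, 76, alpha, beta) - exact_survival(75, 76, alpha, beta)
gap_5yr = frozen_survival(75, 80, alpha, beta) - exact_survival(75, 80, alpha, beta)
```

With an intensity that increases with age, freezing it at the start of the interval overstates survival, and the discrepancy grows as the interval widens.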

Rudimentary goodness-of-fit is carried out by comparing estimated survival curves from the HMM with a Cox-regression performed directly on the survival data. It is worth noting that this approach could be problematic in certain circumstances because the HMM is not nested within the Cox-regression model, so there might be discrepancies between the curves even if the HMM is correctly specified.

Monday, 16 March 2009

Nonparametric estimation in an "illness-death" model when the transition times are interval-censored and one transition is not observed.

Frydman, Gerds, Groen and Keiding have a paper available as a research report from the Department of Biostatistics, Copenhagen. The paper develops previous work on the non-parametric estimation of interval-censored multi-state data. Here the data in question follow a progressive three-state "illness-death" model but the ill to death transition is never observed. This is because the data arise from clinical observation and the trial ceases if a patient is observed to be in the illness state. Such an observation scheme has strong similarities with data considered by Duffy et al relating to breast cancer screening where a three-stage unidirectional model was assumed and the intermediate state was pre-clinical detectable breast cancer. No data on pre-clinical to clinical breast cancer transitions were available as interest was in the natural progression of the disease. Duffy et al analysed the data parametrically, assuming a time homogeneous Markov model. In contrast Frydman et al fit a non-homogeneous Markov model non-parametrically. Since all transitions are interval censored, they model the process in discrete time.
Update: A paper broadly based upon the research report has now been published in Biometrical Journal. The supplementary materials also include R code to implement the proposed algorithm.

Wednesday, 11 March 2009

A multistate approach for estimating the incidence of human immunodeficiency virus by using HIV and AIDS French surveillance data

Sommen, Alioum and Commenges have a new paper in Statistics in Medicine. This applies the penalized likelihood approach to multi-state models, used extensively by the INSERM group, to the area of back-calculation for estimating HIV incidence. The penalization parameters are chosen by minimizing an approximate cross-validation score.

Wednesday, 11 February 2009

Regression analysis of mean quality-adjusted survival time based on pseudo-observations

Gisela Tunes da Silva and John Klein have a new paper in Statistics in Medicine. This applies the pseudo-observations approach to the estimation of mean quality adjusted survival time (QAS). They assume an identity link function for the relationship between mean QAS and covariates at baseline. This direct regression is analogous to similar approaches taken in the competing risks context by Graw, Gerds and Schumacher in Lifetime Data Analysis.

Monday, 2 February 2009

Nonparametric estimation of waiting time distributions in a Markov model based on current status data.

Datta, Lan and Sundaram have a paper available online for the Journal of Statistical Planning and Inference. This further develops approaches to estimating multi-state processes which have a tree structure (i.e. the process is progressive and there is only one possible path to a particular state) under current status observation, first proposed by Datta and Sundaram (Biometrics, 2006). In this paper a method for estimation of the (marginal) waiting time distributions in each state is developed.

Wednesday, 7 January 2009

Pseudo-observations in survival analysis

Per Andersen and Maja Perme have a paper currently available as a research report from the Department of Biostatistics, University of Copenhagen. The paper is a review of the use of pseudo-observations. Pseudo-observations have particular use in multi-state models for allowing covariates to affect transition probabilities directly. This is of particular use if the quantity of interest in an analysis is a transition probability rather than an intensity. Pseudo-residuals obtained from pseudo-observations can also be used to obtain goodness-of-fit diagnostics. The paper gives a good introduction to this promising area of investigation.
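For concreteness, here is a minimal sketch of how pseudo-observations for a survival probability S(t) are formed from leave-one-out Kaplan-Meier estimates. It is a generic illustration of the jackknife construction rather than code from the paper. The function names and the toy data are my own, and times and events are assumed to be plain Python lists.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t); events are 1 for a failure, 0 for censoring."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        current = data[i][0]
        tied = [e for (s, e) in data if s == current]
        surv *= 1.0 - sum(tied) / n_at_risk
        n_at_risk -= len(tied)
        i += len(tied)
    return surv


def pseudo_observations(times, events, t):
    """Pseudo-observation for subject i: n * S_hat(t) - (n - 1) * S_hat_without_i(t)."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    pseudo = []
    for i in range(n):
        loo = kaplan_meier(times[:i] + times[i + 1:], events[:i] + events[i + 1:], t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo


# Without censoring the pseudo-observations reduce to the indicators I(T_i > t).
complete = pseudo_observations([2, 5, 1, 7, 3], [1, 1, 1, 1, 1], t=3)

# With censoring (hypothetical data) they need not lie in [0, 1], but they can
# still be used as responses in a regression on baseline covariates.
censored = pseudo_observations([2, 5, 1, 7, 3], [1, 0, 1, 1, 0], t=4)
```

The same construction applies to any quantity with a suitable non-parametric estimator, such as a transition probability estimated by Aalen-Johansen.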

Update (08/09): The paper is now published in SMMR.