Nonparametric estimation of the lifetime and disease onset distributions for a survival-sacrifice model

Antonio Eduardo Gomes; Piet Groeneboom; Jon A. Wellner

doi:10.1214/19-EJS1598

Statistics

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

In carcinogenicity experiments with animals where the tumor is not palpable, it is common to observe only the time of death of the animal, the cause of death (the tumor or another independent cause, such as sacrifice), and whether the tumor was present at the time of death. These last two indicator variables are evaluated after an autopsy. Defining the non-negative variables T₁ (time of tumor onset), T₂ (time of death from the tumor) and C (time of death from an unrelated cause), we observe (Y, Δ₁, Δ₂), where Y = min{T₂, C}, Δ₁ = 1{T₁ ≤ C}, and Δ₂ = 1{T₂ ≤ C}. The random variables T₁ and T₂ are independent of C and have a joint distribution such that P(T₁ ≤ T₂) = 1. Some authors call this model a “survival-sacrifice model”. [20] (generally to be denoted by LJP (1997)) proposed a weighted least squares estimator for F₁ (the marginal distribution function of T₁), using the Kaplan-Meier estimator of F₂ (the marginal distribution function of T₂). The authors claimed that their estimator is more efficient than the MLE (maximum likelihood estimator) of F₁ and that the Kaplan-Meier estimator is more efficient than the MLE of F₂. However, we show that the MLE of F₁ was not computed correctly, and that the (claimed) MLE of F₁ is even undefined in the case of active constraints. In our simulation study we used a primal-dual interior point algorithm to obtain the true MLE of F₁. The results showed that the MLE of F₁ performs better than the weighted least squares estimator in LJP (1997) at points where F₁ is close to F₂. Moreover, for the model used in the simulation study of LJP (1997) and sample sizes from 100 up to 5000, the MLE-based estimates of the first and second moments of both F₁ and F₂ showed smaller variances than the estimates based on the weighted least squares estimator for F₁ proposed in LJP (1997) and the Kaplan-Meier estimator for F₂.
R scripts are provided for computing the estimates either with the primal-dual interior point method or by the EM algorithm. In spite of the long history of the model in the biometrics literature (since about 1982), basic properties of the real maximum likelihood estimator (MLE) were still unknown. We give necessary and sufficient conditions for the MLE (Theorem 3.1), as an element of a cone, where the number of generators of the cone increases quadratically with sample size. From this and a self-consistency equation, turned into a Volterra integral equation, we derive the consistency of the MLE (Theorem 4.1). We conjecture that (under some natural conditions) the methods used to prove consistency can be extended to prove that the MLE is √n-consistent for F₂ and converges at cube-root-n rate for F₁, but this has not yet been proved.
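
The observation scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code and it does not compute the MLE. It assumes exponential distributions for onset, for the gap between onset and tumor death, and for C (all hypothetical choices made only for illustration), and it builds T₂ as T₁ plus a non-negative gap so that P(T₁ ≤ T₂) = 1. The function names `simulate_survival_sacrifice` and `kaplan_meier_f2` are hypothetical. The second function is a plain Kaplan-Meier estimate of F₂ computed from (Y, Δ₂), treating deaths from the unrelated cause as censoring.

```python
import random


def simulate_survival_sacrifice(n, onset_rate=1.0, gap_rate=1.0,
                                censor_rate=0.5, seed=0):
    """Draw n observations (Y, Delta1, Delta2) from a toy survival-sacrifice model.

    The exponential distributions and their rates are illustrative assumptions.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        t1 = rng.expovariate(onset_rate)      # T1: time of tumor onset
        t2 = t1 + rng.expovariate(gap_rate)   # T2: death from tumor, so T1 <= T2
        c = rng.expovariate(censor_rate)      # C: death from an unrelated cause
        y = min(t2, c)                        # Y = min{T2, C}
        delta1 = 1 if t1 <= c else 0          # Delta1 = 1{T1 <= C}
        delta2 = 1 if t2 <= c else 0          # Delta2 = 1{T2 <= C}
        sample.append((y, delta1, delta2))
    return sample


def kaplan_meier_f2(sample):
    """Kaplan-Meier estimate of F2 from (Y, Delta2).

    Returns a list of (time, F2_hat(time)) at the observed tumor-death times.
    """
    ordered = sorted(sample)                  # sort by observed time Y
    n = len(ordered)
    survival = 1.0
    steps = []
    i = 0
    while i < n:
        y = ordered[i][0]
        at_risk = n - i                       # subjects with observed time >= y
        deaths = 0
        j = i
        while j < n and ordered[j][0] == y:   # group ties at the same time
            deaths += ordered[j][2]           # count only tumor deaths
            j += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            steps.append((y, 1.0 - survival))
        i = j
    return steps


if __name__ == "__main__":
    data = simulate_survival_sacrifice(200, seed=1)
    for time, f2_hat in kaplan_meier_f2(data)[:5]:
        print(round(time, 3), round(f2_hat, 3))
```

Because T₁ ≤ T₂, every simulated observation with Δ₂ = 1 also has Δ₁ = 1, which is the constraint that links the two indicators in this model.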

Original language	English
Pages (from-to)	3195-3242
Number of pages	48
Journal	Electronic Journal of Statistics
Volume	13
Issue number	2
DOIs	https://doi.org/10.1214/19-EJS1598
Publication status	Published - 2019

Access to Document

10.1214/19-EJS1598

euclid.ejs.1569290687: Final published version, 709 KB, Licence: CC BY

Cite this

@article{87f0bb4fdea8464ebbb94f444da58f15,

title = "Nonparametric estimation of the lifetime and disease onset distributions for a survival-sacrifice model",

author = "Gomes, {Antonio Eduardo} and Piet Groeneboom and Wellner, {Jon A.}",

year = "2019",

doi = "10.1214/19-EJS1598",

language = "English",

volume = "13",

pages = "3195--3242",

journal = "Electronic Journal of Statistics",

issn = "1935-7524",

publisher = "Institute of Mathematical Statistics",

number = "2",

}

TY - JOUR

T1 - Nonparametric estimation of the lifetime and disease onset distributions for a survival-sacrifice model

AU - Gomes, Antonio Eduardo

AU - Groeneboom, Piet

AU - Wellner, Jon A.

PY - 2019

Y1 - 2019

UR - http://www.scopus.com/inward/record.url?scp=85073417547&partnerID=8YFLogxK

U2 - 10.1214/19-EJS1598

DO - 10.1214/19-EJS1598

M3 - Article

AN - SCOPUS:85073417547

SN - 1935-7524

VL - 13

SP - 3195

EP - 3242

JO - Electronic Journal of Statistics

JF - Electronic Journal of Statistics

IS - 2

ER -
