Unit 3: Nonparametric Estimation. Notes largely based on Statistical Methods for Reliability Data by W. Q. Meeker and L. A. Escobar, Wiley, 1998, and on their class notes. Ramón V. León 9/3/2009 Stat 567: Unit 3 - Ramón V. León
Unit 3 Objectives
Show the use of the binomial distribution to estimate F(t) from interval and singly right-censored data, without assumptions on F(t). This is called nonparametric estimation.
Explain and illustrate how to compute standard errors for $\hat{F}(t)$ and approximate confidence intervals for F(t).
Show how to extend nonparametric estimation to allow for multiply right-censored data.
Illustrate the Kaplan-Meier nonparametric estimator for data with observations reported as exact failures.
Describe and illustrate a generalization that provides a nonparametric estimator of F(t) with arbitrary censoring.
Data for Plant 1 of the Heat Exchanger Tube Crack Data
A Nonparametric Estimator of F(t_i) Based on Binomial Theory for Interval Singly Censored Data
Plant 1 Estimate of CDF
Comments on the Nonparametric Estimate of F(t_i)
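As a minimal sketch of the binomial-based estimate, the estimate of F(t_i) is the cumulative number of failures up to t_i divided by the sample size n. The Plant 1 counts used here (100 tubes, with 1, 2, and 2 cracked tubes found at the year 1, 2, and 3 inspections) are recalled from the Meeker and Escobar heat exchanger example and are an assumption, since the data slide did not survive extraction. The function name is hypothetical.

```python
from itertools import accumulate

def binomial_cdf_estimate(failures_per_interval, n):
    """Return F_hat(t_i) = (cumulative failures up to t_i) / n for each interval end."""
    return [c / n for c in accumulate(failures_per_interval)]

# Assumed Plant 1 counts: cracked tubes found in years 1, 2, 3 out of n = 100 tubes.
plant1_failures = [1, 2, 2]
F_hat = binomial_cdf_estimate(plant1_failures, n=100)
print(F_hat)
```

The estimate is defined only at the inspection times t_i, because the data carry no information about where inside an interval each failure occurred.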
Confidence Intervals
Some Characteristic Features of Confidence Intervals
The level of confidence expresses one's confidence (not probability) that a specific interval contains the quantity of interest.
The actual coverage probability is the probability that the procedure will result in an interval containing the quantity of interest.
A confidence interval is approximate if the specified level of confidence is not equal to the actual coverage probability.
With censored data, most confidence intervals are approximate. Better approximations require more computation.
Pointwise Binomial-Based Confidence Interval for F(t_i)
Pointwise Normal-Approximation Confidence Interval for F(t_i)
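The two pointwise intervals named above can be sketched as follows. This assumes the binomial-based interval is the usual conservative (Clopper-Pearson type) interval, found here by bisection on the binomial CDF rather than through F-distribution quantiles, and that the normal-approximation interval is the estimate plus or minus z times se, with se = sqrt(F_hat (1 - F_hat) / n). The function names and the input x = 5 failures out of n = 100 (the assumed Plant 1 total after three years) are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = target, where f is decreasing in p on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f is still too large, so move to larger p
        else:
            hi = mid
    return (lo + hi) / 2

def conservative_binomial_ci(x, n, conf=0.95):
    """Conservative binomial-based interval for a proportion."""
    alpha = 1 - conf
    # Lower limit solves P(X >= x; p) = alpha/2, i.e. P(X <= x-1; p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x; p) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

def normal_approx_ci(x, n, conf=0.95):
    """F_hat +/- z * se(F_hat), with the binomial standard error."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    f_hat = x / n
    se = sqrt(f_hat * (1 - f_hat) / n)
    return f_hat - z * se, f_hat + z * se

exact = conservative_binomial_ci(5, 100)   # assumed Plant 1 total: 5 of 100
approx = normal_approx_ci(5, 100)
print(exact, approx)
```

With few failures the normal-approximation interval can sit much closer to zero than the binomial-based one, and for very small counts its lower limit can fall below zero.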
Plant 1 Heat Exchanger Tube Crack Nonparametric Estimate with Conservative Pointwise 95% Confidence Intervals Based on Binomial Theory
Calculation of the Nonparametric Estimate of F(t_i) for Plant 1 from the Heat Exchanger Tube Crack Data
Integrated Circuit (IC) Failure Times in Hours. Data from Meeker (1987). Lfp1370.ld
Nonparametric Estimator of F(t) Based on Binomial Theory for Exact Failures and Singly Right Censored Data
JMP Analysis
JMP Analysis [Plot: fraction failing (0.000 to 0.020) versus hours (0 to 1300)]
Comments on the Nonparametric Estimate of F(t)
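Delta Method and Derivative of the Logit of the CDF
Delta Method: $\mathrm{Var}[f(\hat{\theta})] \approx [f'(\hat{\theta})]^2 \, \mathrm{Var}(\hat{\theta})$
Derivative of the Logit Function: $f(x) = \log\left(\frac{x}{1-x}\right) = \log x - \log(1-x)$, so $f'(x) = \frac{1}{x} + \frac{1}{1-x} = \frac{1}{x(1-x)}$
Hence $\mathrm{se}_{\mathrm{logit}(\hat{F})} = \frac{\mathrm{se}_{\hat{F}}}{\hat{F}(1-\hat{F})}$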
Pointwise Normal-Approximation Confidence Interval for F(t_i) Based on the Logit Transformation
Pointwise Normal-Approximation Confidence Interval for F(t_i) Based on the Logit Transformation (continued)
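A minimal sketch of the logit-based interval: build a normal-approximation interval for logit(F) using the delta-method standard error se / (F_hat (1 - F_hat)), then map its endpoints back through the inverse logit. This gives the closed form [F_hat / (F_hat + (1 - F_hat) w), F_hat / (F_hat + (1 - F_hat) / w)] with w = exp(z se / (F_hat (1 - F_hat))). The function name and the numerical inputs are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

def logit_ci(f_hat, se_f, conf=0.95):
    """Pointwise CI for F(t) from a normal approximation on the logit scale."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # Delta method: se of logit(F_hat) is se_f / (F_hat * (1 - F_hat)).
    w = exp(z * se_f / (f_hat * (1 - f_hat)))
    lower = f_hat / (f_hat + (1 - f_hat) * w)
    upper = f_hat / (f_hat + (1 - f_hat) / w)
    return lower, upper

# Hypothetical inputs, for illustration only.
lower, upper = logit_ci(f_hat=0.05, se_f=0.0218)
print(lower, upper)
```

Unlike the untransformed interval, the endpoints always stay strictly between 0 and 1, and the interval is symmetric around the estimate on the logit scale rather than on the probability scale.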
Nonparametric Estimate for the IC Data with Normal-Approximation Pointwise 95% Confidence Interval Based on the Logit Transformation
Notation Example
$n = 13$: sample size
$d_i = 3$: number of failures in the $i$th interval
$r_i = 2$: number of right-censored observations at $t_i$
$n_i = 7$: risk set at $t_{i-1}$, where $n_i = n - \sum_{j=0}^{i-1} d_j - \sum_{j=0}^{i-1} r_j$
$\hat{p}_i = 3/7$: estimate of the probability of failing in the $i$th interval, given that the item has survived to the beginning of interval $i$
A Nonparametric Estimate of F(t_i) Based on Interval Data and Multiple Right Censoring
Pooling of the Heat Exchanger Tube Crack Data
Calculation of the Nonparametric Estimate of F(t_i) for the Heat Exchanger Tube Crack Data
Slide annotations ($\hat{p}_i$, $\hat{q}_i$): 0.0133, 0.9867; 0.0254, 0.9746; 0.0206, 0.9794
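The calculation for interval data with multiple right censoring can be sketched as below: the risk set shrinks by the failures and censored units of each interval, the conditional failure probability is d_i / n_i, and the survival estimate is the running product of (1 - p_hat_j). The pooled counts (n = 300; d = 4, 5, 2; r = 99, 95, 95) are recalled from the Meeker and Escobar example and are an assumption here, since the table itself did not survive extraction; they do reproduce the conditional probabilities 0.0133, 0.0254, and 0.0206 that appear on the slide. The function name is hypothetical.

```python
def interval_censored_estimate(n, d, r):
    """Nonparametric estimate for interval data with multiple right censoring.

    n: initial sample size
    d: failures in each interval
    r: units right censored at the end of each interval
    Returns one tuple (n_i, p_hat, q_hat, S_hat, F_hat) per interval.
    """
    n_i, surv, rows = n, 1.0, []
    for d_i, r_i in zip(d, r):
        p_hat = d_i / n_i          # conditional probability of failing in interval i
        surv *= 1 - p_hat          # S_hat(t_i) = product of q_hat_j for j <= i
        rows.append((n_i, p_hat, 1 - p_hat, surv, 1 - surv))
        n_i -= d_i + r_i           # risk set entering the next interval
    return rows

# Assumed pooled heat exchanger counts (not recoverable from the slide text).
rows = interval_censored_estimate(n=300, d=[4, 5, 2], r=[99, 95, 95])
for n_i, p_hat, q_hat, s_hat, f_hat in rows:
    print(n_i, round(p_hat, 4), round(q_hat, 4), round(s_hat, 4), round(f_hat, 4))
```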
Nonparametric Estimate for the Heat Exchanger Tube Crack Data
Approximate Variance of Estimated CDF
Recall that $\hat{F}(t_i) = 1 - \hat{S}(t_i)$, so $\mathrm{Var}[\hat{F}(t_i)] = \mathrm{Var}[\hat{S}(t_i)]$.
Also $\hat{S}(t_i) = \prod_{j=1}^{i} (1 - \hat{p}_j) = \prod_{j=1}^{i} \hat{q}_j$ and $S(t_i) = \prod_{j=1}^{i} q_j$.
Then a first-order Taylor series approximation of $\hat{S}(t_i)$ is
$\hat{S}(t_i) \approx S(t_i) + \sum_{j=1}^{i} \left. \frac{\partial S(t_i)}{\partial q_j} \right|_{q_j} (\hat{q}_j - q_j) = S(t_i) + \sum_{j=1}^{i} \frac{S(t_i)}{q_j} (\hat{q}_j - q_j)$
Approximate Variance of Estimated CDF (continued)
Then it follows that
$\mathrm{Var}[\hat{S}(t_i)] \approx \sum_{j=1}^{i} \left[ \frac{S(t_i)}{q_j} \right]^2 \mathrm{Var}(\hat{q}_j) = \sum_{j=1}^{i} \left[ \frac{S(t_i)}{q_j} \right]^2 \frac{q_j p_j}{n_j}$
because the $\hat{q}_j$ are approximately uncorrelated binomial proportions. (The $\hat{q}_j$ values are asymptotically uncorrelated as $n \to \infty$.) Hence
$\mathrm{Var}[\hat{F}(t_i)] = \mathrm{Var}[\hat{S}(t_i)] \approx [S(t_i)]^2 \sum_{j=1}^{i} \frac{p_j}{n_j q_j} = [S(t_i)]^2 \sum_{j=1}^{i} \frac{p_j}{n_j (1 - p_j)}$
Estimating the Standard Error of the Estimated CDF
Standard Errors for the Estimated CDF of the Heat Exchanger Tube Crack Data
Slide annotations ($\hat{p}_i$, $\hat{S}(t_i)$): 0.0133, 0.9867; 0.0254, 0.9616; 0.0206, 0.9418
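A sketch of the standard error calculation, assuming the estimate is obtained by plugging p_hat_j and S_hat(t_i) into the approximate variance S(t_i)^2 times the sum of p_j / (n_j (1 - p_j)), which is commonly known as Greenwood's formula. The risk-set sizes and failure counts are the assumed pooled heat exchanger values (n_j = 300, 197, 97; d_j = 4, 5, 2), recalled from the Meeker and Escobar example rather than taken from the slide text. The function name is hypothetical.

```python
from math import sqrt

def greenwood_se(n_risk, d):
    """Standard error of F_hat(t_i) (equivalently S_hat(t_i)) at each interval end.

    n_risk: risk-set size n_j at the start of each interval
    d: failures d_j in each interval
    Assumes every p_hat_j < 1; otherwise the sum is undefined.
    """
    surv, acc, out = 1.0, 0.0, []
    for n_j, d_j in zip(n_risk, d):
        p_hat = d_j / n_j
        surv *= 1 - p_hat                     # S_hat(t_i)
        acc += p_hat / (n_j * (1 - p_hat))    # running sum of p_j / (n_j q_j)
        out.append(surv * sqrt(acc))          # se = S_hat * sqrt(sum)
    return out

# Assumed pooled heat exchanger risk sets and failure counts.
se = greenwood_se(n_risk=[300, 197, 97], d=[4, 5, 2])
print([round(s, 4) for s in se])
```

Because Var[F_hat] equals Var[S_hat], the same values serve as standard errors for either the CDF or the survival estimate.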
Recall: Pointwise Normal-Approximation Confidence Interval for F(t_i) Based on the Logit Transformation
Normal-Approximation Pointwise Confidence Intervals of the Heat Exchanger Tube Crack Data
JMP Analysis
Recall:
Shock Absorber Failure Data
First reported in O'Connor (1985).
Failure times, in number of kilometers of use, of vehicle shock absorbers.
Two failure modes, denoted by M1 and M2.
One might be interested in the distribution of time to failure for mode M1, mode M2, or the overall failure-time distribution of the part.
Data: Table C.2 in the Appendix, page 630.
Here we do not differentiate between modes M1 and M2. We will estimate the distribution of time to failure by either mode M1 or M2.
Failure Pattern in the Shock Absorber Data: Failure Mode Ignored
Nonparametric Estimates for the Shock Absorber Data up to 12,220 km
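For exact failure times with multiple right censoring, the same product of conditional survival probabilities is evaluated at each distinct failure time, which is the Kaplan-Meier (product-limit) estimator. The sketch below uses a small hypothetical data set, not the shock absorber data, and a hypothetical function name. It follows the usual convention that units censored at a failure time are still counted as at risk at that time.

```python
def kaplan_meier(times, failed):
    """Kaplan-Meier estimate for exact failure times with right censoring.

    times: observed times (failure or censoring)
    failed: True for a failure, False for a right-censored observation
    Returns one tuple (t, n_at_risk, d, S_hat, F_hat) per distinct failure time.
    """
    data = sorted(zip(times, failed))
    n_at_risk, surv, estimate = len(data), 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_t = [f for tt, f in data if tt == t]   # all observations tied at time t
        d = sum(at_t)                             # failures at time t
        if d > 0:
            surv *= 1 - d / n_at_risk
            estimate.append((t, n_at_risk, d, surv, 1 - surv))
        n_at_risk -= len(at_t)                    # failures and censored units leave
        i += len(at_t)
    return estimate

# Hypothetical data: six units, two of them right censored.
km = kaplan_meier(times=[10, 15, 20, 20, 25, 30],
                  failed=[True, False, True, True, False, True])
for row in km:
    print(row)
```

The estimate is a step function that jumps only at failure times. Censored observations change later jumps by reducing the risk set.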
JMP Analysis
Theory of Simultaneous Confidence Bands
SPLIDA Graph: Turbine Wheel Crack Initiation Data with Nonparametric Pointwise 95% Confidence Bands [Plot: fraction failing versus hundreds of hours; generated Sat Aug 23 22:36:34 EDT 2003]
SPLIDA Graph: Turbine Wheel Crack Initiation Data with Nonparametric Simultaneous 95% Confidence Bands [Plot: fraction failing versus hundreds of hours; generated Sat Aug 23 22:31:59 EDT 2003]
JMP Analysis
Combined
Start Time   End Time   Survival   Failure   SurvStdEr
10.0000      10.0000    0.9302     0.0698    0.0337
14.0000      14.0000    0.9302     0.0698    0.0473
18.0000      18.0000    0.9041     0.0959    0.0345
22.0000      22.0000    0.8333     0.1667    0.0680
26.0000      26.0000    0.7778     0.2222    0.0657
30.0000      30.0000    0.7778     0.2222    0.0650
34.0000      34.0000    0.5385     0.4615    0.1383
38.0000      38.0000    0.4190     0.5810    0.0865
42.0000      42.0000    0.4190     0.5810    0.0766
46.0000      46.0000    0.4165     0.5835    0.0822
Omitted Topic in Chapter 3
Uncertain censoring time:
We have assumed that censoring takes place at the end of the observation intervals.
One can instead assume that censoring happens in the middle of the observation intervals.
This leads to the actuarial or life table nonparametric estimate of the cdf. See Table 3.6, page 64.
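As a rough sketch of the mid-interval assumption, the standard actuarial adjustment treats each unit censored during an interval as at risk for half of it, so the effective risk set becomes n_i - r_i / 2. Whether this matches the layout of Table 3.6 exactly is an assumption; the counts below and the function name are hypothetical.

```python
def actuarial_estimate(n, d, r):
    """Life table (actuarial) estimate of F(t_i) with mid-interval censoring.

    n: initial sample size
    d: failures in each interval
    r: units censored during each interval
    """
    n_i, surv, f_hat = n, 1.0, []
    for d_i, r_i in zip(d, r):
        effective = n_i - r_i / 2      # censored units count as half at risk
        surv *= 1 - d_i / effective
        f_hat.append(1 - surv)
        n_i -= d_i + r_i               # units entering the next interval
    return f_hat

# Hypothetical counts, for illustration only.
F_act = actuarial_estimate(n=100, d=[5, 4], r=[10, 20])
print([round(f, 4) for f in F_act])
```

Compared with the end-of-interval assumption, the smaller effective risk set gives slightly larger conditional failure probabilities whenever an interval contains censored units.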