Updated, Nov 14, 2020

1 Salk Polio Vaccine Trial

See full material here. and here.

Supplementary Exercise 13.3 The 1954 Field Trial of the Salk Poliomyelitis Vaccine from here.

Tha data are from Paul Meier’s The Biggest Public Health Experiment Ever

This link has the Summary of Study Cases by Diagnostic Class and Vaccination Status. Here we will limit to the numbers of children and numbers of cases, and calculate the rates ourselves.

....... Study Popln Reported <<<<<<<<<< Number of Cases of >>>>>>>>>

Group No. Children Cases Polio ParalyticPolio NonParalytic FatalPolio

Vac 200745 82 57 33 24 0

Pla 201229 162 142 115 27 4

NI 338778 182 157 121 36 0

IV 8484 2 2 1 1 0

All 749236 428 358 270 88 4

Vac = Vaccinated; Pla = Placebo; NI = NotInoculated; IV = Incomplete Vaccination

Some 70 reported cases were deemed to be Not Polio (25 in Vac, 20 in Pla, and 25 in NI), are shown in Meier’s table, but omitted here because of space constraints.

The data are based on followup from the children (in grades 1 2 and 3), randomized and vaccinated in the months before the 1954 summer vacation, and followed to the end of December 1954; i.e. over one `polio season’.

Questions

  1. Repeat C&H’s exercises 13.1 and 13.3 using the data of paralytic polio (\(PP\)) instead of those in C&H Table 13.1, i.e., compute point and interval estimates of the in, and the of, the rates of paralytic polio with the Salk vaccine and Placebo [first 2 rows].

With \(M=\)number of vaccinated persons; \(N=\) number of persons in corresponding control group; \(m=\) number of cases among vaccinated persons [treated as the realization of a Poisson random variable]; \(n=\)number of cases among persons in corresponding control group [also treated as the realization of a Poisson random variable]; \(T=\)total number of persons \((T =M + N)\); \(t=\) total number of cases \((t=m + n)\), and \(\lambda = RateRatio,\) …

the probability of \(m\) vaccinated cases, given a total of \(t\) cases, may be expressed as: \[^tC_m \times \bigg\{\frac{M \lambda}{M \lambda + n}\bigg\} ^m \times \bigg\{\frac{N}{M \lambda + n}\bigg\}^{t-m}\] This is the binomial distribution with probability parameter \[\frac{M\lambda}{M\lambda+N}.\]

  1. In this example, how similar would the CI have been if they had used an unconditional approach to the RateRatio? Is this similarity because of the large numbers of events, or because of something else?

  2. As is clear from the headline, [ There are 3 virus strains; results in table are not strain-specific. 90% in headline is.] journalists, and the public, are more interested in the percent efficacy, \(100 \times (1 - RateRatio),\) than in the RateRatio. Therefore, convert the CI for the RateRatio into a CI for the percent efficacy.

Answers

C&H (non-experimental data)

1a.C&H’s exercise 13.1, they denote the Rate Ratio by \(\theta\), the PersonYears by \(Y\), the numbers of events by \(D\), the compared index and reference categories by 1 and 0.

The most likely value for \(\theta\) is \[\frac{D_1/Y_1}{D_0/Y_0} = \frac{28/1875.5}{17/2768.9} = 2.48.\] The standard deviation of the estimate of \(\log[\theta]\) is \[S = \sqrt{1/28 + 1/17} = 0.3075,\] so that the 90% error factor for \(\theta\) is

\[\exp(1.645 \times 0.3075) = 1.66.\] The 90% confidence limits for the rate ratio are 2.48/1.66 = 1.49 (lower limit) and 2.48 \(\times 1.66\) = 4.12 (upper limit).

1a.C&H’s exercise 13.3, Approx. 90% for the Difference in Rates

\[M = \frac{28}{1875.5} - \frac{17}{2768.9} = 0.00893 \ (8.93 \ per \ 1000PY)\] \[S = \sqrt{ \frac{28}{1875.5^2} + \frac{17}{2768.9^2} } = 0.00321 \ (3.21 \ per \ 1000PY)\]

The 90% confidence interval is \[3.21 \pm 1.645 \times S \ = \ 3.65 \ to \ 14.2 \ per \ 1000PY.\] SALK VACCINE TRIAL

Point and interval estimates of the difference in, and the ratio of, the rates of paralytic polio with the Salk vaccine and Placebo [first 2 rows].

  1. using unconditional approach (2 Poisson RVs), as above:

We can count the 200745 and 201229 and as \(Y_1\) = 200745 and \(Y_0\) = 201229 ‘Child-Seasons.’

The most likely value for \(\theta\) is \[\frac{D_1/Y_1}{D_0/Y_0} = \frac{33/200745}{115/201229} = 0.29 \ \ ( \ VACCINE \ EFFICACY \ = \ 71\% \ ).\] The standard deviation of the estimate of \(\log[\theta]\) is \[S = \sqrt{1/33 + 1/115} = 0.1975,\] so that the 90% error factor for \(\theta\) is

\[\exp(1.645 \times 0.1975) = 1.38.\] The 90% confidence limits for the rate ratio are 0.29/1.38 = 0.21 (lower limit) and 0.29 \(\times 1.38\) = 0.40 (upper limit). Thus, the 90% confidence limits for the Vaccine Efficacy (VE) are 1-0.40 = 0.60 = 60% and 1-0.21 = 0.79 = 79%.

Approx. 90% for the Difference in Rates

\[M = \frac{33}{200745} - \frac{115}{201229} = - 40.7 \ cases \ per \ 100000 CS.\] \[S = \sqrt{ \frac{33}{200745^2} + \frac{115}{201229^2} } = 6.05 \ per \ 1000PY)\]

The 90% confidence interval is \[-40.7 \pm 1.645 \times S \ = \ -50.7 \ to \ -30.7 \ cases \ per \ 1000PY.\]

Francis (1955) also used a conditional approach when computing their confidence intervals. See (on the top right corner of the BIOS601 website, under Applications) the chapter on Statistical Methods. This chapter says… (They used \(\lambda\) for the rate ratio. Since we use it for a rate, we will use \(\theta\) for the rate ratio.)

With \(M=\)number of vaccinated persons; \(N=\) number of persons in corresponding control group; \(m=\) number of cases among vaccinated persons [treated as the realization of a Poisson random variable]; \(n=\)number of cases among persons in corresponding control group [also treated as the realization of a Poisson random variable]; \(T=\)total number of persons \((T =M + N)\); \(t=\) total number of cases \((t=m + n)\), and \(\theta = RateRatio,\) … the probability of m vaccinated cases, given a total of t cases, may be expressed as the binomial distribution with ‘\(n\)’ = \(t\) and probability parameter \[\pi = \frac{M \ \theta}{M \ \theta + N}.\] Since in the 1:1 RCT, \(M \approx N \approx 200,000\), we can write it as \[\pi = \frac{\theta}{\theta + 1}.\] And, since the Vaccine Efficacy (VE) is 100 \(\times (1-\theta)\), we have \[\pi = \frac{100 - VE}{100 - VE + 100} = \frac{100 - VE}{200 - VE}.\]

SO, we can calculate a CI for \(\pi\) and then convert to a CI for the VE using the reverse-transform \[ VE = 100 \times \frac{1 - 2 \pi}{1 - \pi}.\]

We can double check our algebra:

Previously, we have a point estimate \(\widehat{VE} = 100 \times (1- 0.29)\) = 71%. But we also have \[ \widehat{\pi} = \frac{33}{33+115} = 33/148 = 0.223,\] so \[ \widehat{VE} = 100 \times \frac{1 - 2 \hat{\pi}}{1 - \hat{\pi}}= 100 \times \frac{1 - 2 \times 0.223}{1 - 0.223} = 100 \times \frac{0.554}{0.777} = 71.3\%.\]

So, now we need to quantify the ‘statistical uncertainty’ of the \(\hat{\pi}\) = 0.223, and translate it to the `statistical uncertainty’ of the \(\widehat{VE}\) = 71.3%.

We have several ways to obtain a 90% CI for \(\pi\). Given that the 33 and the 115 are large enough to employ the CLT, we can simply use 0.0342162 \[SE[\hat{\pi}] = \sqrt{ \hat{\pi} \times (1-\hat{\pi}) / 148 } = \sqrt{0.223 \times 0.777 / 148} = 0.0342,\] so that the 90% CI for \(\pi\) is 0.223 \(\pm 1.645 \times 0.0342\), or 0.167 to 0.279. The last step is to convert this CI to the CI for VE:

\[ VE_{lower} = 100 \times \frac{1 - 2 \times 0.279}{1 - 0.279} = \underline{61.3\%}; \ \ VE_{upper} = 100 \times \frac{1 - 2 \times 0.167}{1 - 0.167} = \underline{80.0\%}.\] Other approaches will yield slightly different limits for \(\pi\),

binom.test(33,145,conf.level=0.90)
## 
##  Exact binomial test
## 
## data:  33 and 145
## number of successes = 33, number of trials = 145, p-value = 2.955e-11
## alternative hypothesis: true probability of success is not equal to 0.5
## 90 percent confidence interval:
##  0.1714766 0.2922572
## sample estimates:
## probability of success 
##              0.2275862

and therefore slightly different limits for \(VE\).

April 13, 1955. The VE = 90% refers to the strain against which the vaccine was most effective. There were 3 strains of Polio, so it was a tri-valent vaccine.

April 13, 1955. The VE = 90% refers to the strain against which the vaccine was most effective. There were 3 strains of Polio, so it was a tri-valent vaccine.

See also the NY Times announcement.

FITTING RATE-RATIO VIA A GLM

We can do it unconditionally, using a (2-parameter) regression model for 2 independent Poisson RVs, or conditionally using a (1-parameter) binomial regression model** for the split of the independent Poisson RVs ie treating the 145 as a fixed number.

In both cases we will treat the numbers of children as child-seasons (CS), and use the Vaccine Arm Indicator as a V = 0/1 regressor variate.

Multiplicative (Rate Ratio) Model:

\[E[n.cases | V] = \lambda_V \times CS_V, \ (V=0,1); \ \lambda_1 = \theta \times \lambda_0\]

2 indpendent Poisson RVs

n.cases = c(33,115); CS=c(200745,201229)/100000 ; V=c(1,0)

fitted.mult.model = glm(n.cases ~ V + offset(log(CS)), 
                        family = poisson)

summary(fitted.mult.model)
## 
## Call:
## glm(formula = n.cases ~ V + offset(log(CS)), family = poisson)
## 
## Deviance Residuals: 
## [1]  0  0
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  4.04566    0.09325   43.38  < 2e-16 ***
## V           -1.24602    0.19748   -6.31  2.8e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4.7903e+01  on 1  degrees of freedom
## Residual deviance: 1.0214e-14  on 0  degrees of freedom
## AIC: 15.924
## 
## Number of Fisher Scoring iterations: 2
(fitted.rate.0 =  # cases per 100,000 Child-Seasons
      round( exp(fitted.mult.model$coefficients[1]))
)
## (Intercept) 
##          57
(fitted.rate.ratio =  # reference category = Placebo  
      round( exp(fitted.mult.model$coefficients[2]),3)
)
##     V 
## 0.288
# CI for VE = 100*(1 - RateRatio)

( round( 100*(1- exp( confint(fitted.mult.model,level=0.90)[2,2:1] )),1) )
## Waiting for profiling to be done...
## 95 %  5 % 
## 60.6 79.4

Binomial regression model for the Binomial split of the sum of independent Poisson RVs i.e., treating the 33+115 = 148 as a fixed number.

n.cases.v =  33; n.cases.p = 115
log.CS.ratio = log(200745/201229) 


fitted.binomial.model = glm(
      cbind(n.cases.v,n.cases.p) ~ 
                      1 + offset(log.CS.ratio), 
                      family = binomial )

summary(fitted.binomial.model)
## 
## Call:
## glm(formula = cbind(n.cases.v, n.cases.p) ~ 1 + offset(log.CS.ratio), 
##     family = binomial)
## 
## Deviance Residuals: 
## [1]  0
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.2460     0.1975   -6.31  2.8e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: -1.088e-14  on 0  degrees of freedom
## Residual deviance: -1.088e-14  on 0  degrees of freedom
## AIC: 7.0875
## 
## Number of Fisher Scoring iterations: 3
(fitted.rate.ratio =  # reference category = Placebo  
      round( exp(fitted.binomial.model$coefficients),3)
)
## (Intercept) 
##       0.288
# CI for VE = 100*(1 - RateRatio)

#( round( 100*(1- exp( 

round(100*(1-
       exp(confint(fitted.binomial.model,level=0.90)[2:1])),1)
## Waiting for profiling to be done...
## 95 %  5 % 
## 60.6 79.4

THE ‘2-independent Poissons model’ and the condtional binomial-split model yield IDENTICAL RESULTS.

FITTING RATE-DIFFERENCE VIA A GLM

\[E[n.cases | V] = [\lambda_0 + (\Delta\lambda) \times V ] \times CS, \ (V=0,1); \ \lambda_1 = \lambda_0 + (\Delta\lambda).\] So \[E[n.cases | V] = \lambda_0 \times \underline{CS} + (\Delta\lambda) \times \underline{V \times CS} = \lambda_0 \times \underline{X_1} + (\Delta\lambda) \times \underline{X_2}.\]

( V.CS = V*CS )
## [1] 2.00745 0.00000
fitted.AdditiveRates.model = glm(
      n.cases ~ -1 + CS + V.CS, 
            family = poisson(link="identity") )

summary(fitted.AdditiveRates.model)
## 
## Call:
## glm(formula = n.cases ~ -1 + CS + V.CS, family = poisson(link = "identity"))
## 
## Deviance Residuals: 
## [1]  0  0
## 
## Coefficients:
##      Estimate Std. Error z value Pr(>|z|)    
## CS     57.149      5.329   10.72  < 2e-16 ***
## V.CS  -40.710      6.049   -6.73 1.69e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: Inf  on 2  degrees of freedom
## Residual deviance:   0  on 0  degrees of freedom
## AIC: 15.924
## 
## Number of Fisher Scoring iterations: 2
round(confint(fitted.AdditiveRates.model,level=.90),1)
## Waiting for profiling to be done...
##        5 %  95 %
## CS    48.8  66.4
## V.CS -50.9 -31.0

THE POINT ESTIMATE AND ITS SE CLOSELY MATCH THE HAND-CALCULATED ONES ABOVE.

2 A Controlled Trial of a Human Papillomavirus Type 16 Vaccine – and schematic to illustrate the relation between the Proportion of Vaccinates Cases and the Vaccine Efficacy

See the full artice here

Supplementary Exercise 13.6

Background: Approximately 20 percent of adults become infected with human papillomavirus type 16 (HPV-16). Although most infections are benign, some progress to anogenital cancer. A vaccine that reduces the incidence of HPV-16 infection may provide important public health benefits.

Methods: In this double-blind study, we randomly assigned 2392 young women (defined as females 16 to 23 years of age) to receive three doses of placebo or HPV-16 virus-like-particle vaccine (40 \(\mu\)g per dose), at day 0, month 2, and month 6. Genital samples to test for HPV-16 DNA were obtained at enrollment, one month after the third vaccination, and every six months thereafter. Women were referred for colposcopy according to a protocol. Biopsy tissue was evaluated for cervical intraepithelial neoplasia and analyzed for HPV-16 DNA with use of the polymerase chain reaction. The primary end point was persistent HPV-16 infection, defined as the detection of HPV-16 DNA in samples obtained at two or more visits. The primary analysis was limited to women who were negative for HPV-16 DNA and HPV-16 antibodies at enrollment and HPV-16 DNA at month 7.

Results: The women were followed for a median of 17.4 months after completing the vaccination regimen. The incidence of persistent HPV-16 infection was 3.8 per 100 woman-years at risk in the placebo group and 0 per 100 woman-years at risk in the vaccine group (100 percent efficacy; 95 percent confidence interval, 90 to 100; P$<$0.001).All nine cases of HPV-16-related cervical intraepithelial neoplasia occurred among the placebo recipients.

Conclusions: Administration of this HPV-16 vaccine reduced the incidence of both HPV-16 infection and HPV-16-related cervical intraepithelial neoplasia. Immunizing HPV-16-negative women may eventually reduce the incidence of cervical cancer.

(from article in N Engl J Med 2002;347:1645-51.)

Questions

1.. Why this design rather than a ‘fixed number of woman-years-of-follow-up’ design?

2.. Let \(I\) denote incidence, \(v\) denote the vaccinated and \(u\) the unvaccinated, [Clayton and Hills use the general letters \(\lambda_v\) and \(\lambda_u\)]. Let \(IR\) denote the incidence ratio \(I_v / I_u\), [Clayton and Hills use the general letter \(\theta\)]. `Vaccine efficacy’ (\(E\)) is defined as a percentage \[E = 100 \times (I_u - I_v) / I_u = 100 \times (1 - I_v / I_u ) = 100 \times (1 - IR) = 100 \times (1 - \theta)\]

Consider a R.C.T., so that random variation is not an issue. A fraction \(F_v\) received the vaccine, and the average follow-up time was \(PT_V\) units; the remaining fraction \(F_u =1-F_v\) received the placebo, and the average follow-up time was \(PT_U\) units. Denote the number of cases (of persistent infection) in the un-vaccinated sub-population by \(C_{u}\) and the corresponding number of cases in the vaccinated sub-population by \(C_{v}\). Let \(C_{total}\) = \(C_{u}\) + \(C_{v}\).

[Clayton and Hill use the letter \(D\), presumably to stand for numbers of eaths; we use the more general letter \(C\) for \underline{c}ases of' , i.e.,transitions’ from the initial state (HPV-) to the state the vaccine is intended to prevent (persistent HPV+).] Let \(\Pi = C_{v} / C_{total}\) denote the (theoretical) proportion of proportion of all cases that had been vaccinated.

\(\bullet\) Assuming the rate ratio remains constant over time, express the parameter \(\Pi\) as a function of (a) the design parameters \(F_v\) and \(PT_v/PT_u\), and the parameter \(IR\) (or \(\theta\)), (b) the design parameters \(F_v\) and \(PT_v/PT_u\), and the parameter \(E\).

\(\bullet\) Show the mathematical link between these parameters and the ones C&H use at the top of page 127.

3.. In the actual study cited above, the primary per-protocol efficacy analysis was based on observing persistent HPV-16 infection in 0 of 768 vaccinated women followed for 1084.0 woman years (w-y) and 41 in 765 unvaccinated women followed for 1076.9 women years (rate 3.8 per 100w-y).

This problem is similar in structure to that analyzed by C&H in their exercise 13.1 & 13.3. Examine the methods they used (their solutions are on p. 131-132.) Is it possible to use C&H’s method with the HPV data?

If so, carry them out. If not, describe what other approach(es) is(are) possible, and carry it(them) out.

4.. (In point/bullet form) what are the take-home data-analysis messages from this exercise?

Answers

1.. Because statistical precision and power depend on (for the log of the rate ratio) the numbers of vaccinated and unvaccinated cases, not on the numbers of women followed. As we will see with the COVID0-19 vaccine trials, this has become standard practice.

2.. This is a more general version of the design of the Polio trial. In at least one of the COVID vaccine trials, the randomization to Vaccine:Placebo is 2:1 rather than 1:1, and so the ‘vaccinated’’ and ‘unvaccinated’ PT at risk may not be equal.

Schematically, here is the more general version

fraction.vaccinated = 2/3

VE = 90
Incidence.Rate.Ratio = 1 - VE/100

par(mfrow=c(1,1), mar=c(0,0,0,0))

plot(c(0,1),c(-0.30,1.05),col="white")


rect(fraction.vaccinated,0,1,1, col="red")
segments(fraction.vaccinated,Incidence.Rate.Ratio,
         1, Incidence.Rate.Ratio,lwd=1, col="white")
rect(0,0,fraction.vaccinated,Incidence.Rate.Ratio, col="red")
rect(0,Incidence.Rate.Ratio,fraction.vaccinated,1, col="White")
text(fraction.vaccinated/2, 0,
     "F = fraction\nvaccinated", 
     adj=c(0.5,1.1), cex=2)
text(fraction.vaccinated+(1-fraction.vaccinated)/2, 0,
     "1 - F = fraction\nnot vaccinated", 
     adj=c(0.5,1.1), cex=2)
text(fraction.vaccinated/2,  
     Incidence.Rate.Ratio + (1-Incidence.Rate.Ratio)/2,
     "Cases Prevented\nBy Vaccination", cex=2)
text(fraction.vaccinated/2,  
     Incidence.Rate.Ratio/2,
     "Cases in the Vaccinated", cex=2)
text(fraction.vaccinated + (1-fraction.vaccinated)/2,  
     Incidence.Rate.Ratio + (1-Incidence.Rate.Ratio)/2,
     "Cases in the\nNot-Vaccinated", cex=2)

arrows(fraction.vaccinated, 0.005, 
       fraction.vaccinated,Incidence.Rate.Ratio-0.005,
       code=3,length=0.10,angle=40,lwd=1.5)
arrows(fraction.vaccinated, Incidence.Rate.Ratio+0.005, 
       fraction.vaccinated,1-0.005,
       code=3,length=0.10,angle=40,lwd=1.5)
arrows(0.45, 0.25, 
       fraction.vaccinated, Incidence.Rate.Ratio/2)
text(0.45, 0.25, "100 - VE", adj=c(1.1,0.5),cex=2)

arrows(0.45, 0.75, 
       fraction.vaccinated, Incidence.Rate.Ratio+
         (1-Incidence.Rate.Ratio)/2)
text(0.45, 0.75, "100 x (1-RateRatio) = VE", adj=c(1.1,0.5),cex=2)

text(0, 1, "100%", adj=c(0.5,0),cex=2)

text(0.02, -0.2, "P =",cex=2, col="red")

text(0.25, -0.2, "No. of Cases in Vaccinated\n-------------------------------------\nTotal No. of Cases",
     cex=2, col="red")
text(0.5, -0.2, "=",cex=2,  col="red")
text(0.75, -0.2, "F x (100 - VE)\n-------------------------------------\nF x (100 - VE) + (1-F) x 100",
     cex=2, col="red")
Relation between Vaccine Efficacy (VE) and the proportion (P) of cases that had been vaccinated, in the case where F=2/3, i.e., 2:1 Vaccine: Placebo allocation. If F = 1/2, then P - (100-VE)/(200-VE).

Relation between Vaccine Efficacy (VE) and the proportion (P) of cases that had been vaccinated, in the case where F=2/3, i.e., 2:1 Vaccine: Placebo allocation. If F = 1/2, then P - (100-VE)/(200-VE).

3.. In the actual study cited above, the primary per-protocol efficacy analysis was based on observing persistent HPV-16 infection in 0 of 768 vaccinated women followed for 1084.0 woman years (w-y) and 41 in 765 unvaccinated women followed for 1076.9 women years (rate 3.8 per 100w-y).

This problem is similar in structure to that analyzed by C&H in their exercise 13.1 & 13.3. But it is not possible to use C&H’s method, becasue the SE for the log of the rate ratio, i.e., \[S = \sqrt{1/0 + 1/43}\] (that one would use to compute a z-based CI) is not finite. The exact afforded by the Binomial distribution that arises from the conditional approach method is a way out of this problem, at least in providing a lower bound on the VE.

In this approach, the 0/43 is seen as a realization of a binomuial r.v. where \(n =\) 43, \(F\) = 1084.0 / (1084.0+1076.9) \(\approx\) 0.5, and \[\pi = \frac{F \times (100-VE)}{P \times (100-VE) + (1-F) \times 100} \approx \frac{100-VE}{200-VE}.\] Based on the 0/43, the upper ,limit of the 95% CI for \(\pi\) obtained by

binom.test(0,43)
## 
##  Exact binomial test
## 
## data:  0 and 43
## number of successes = 0, number of trials = 43, p-value = 2.274e-13
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.00000000 0.08221112
## sample estimates:
## probability of success 
##                      0

is 0.082. Back-transforming it to the VE scale, we get \[ VE_{lower} = 100 \times \frac{1 - 2 \pi_{upper}}{1 - \pi_{upper}}= 100 \times \frac{1 - 2 \times 0.082}{1 - 0.082} = 100 \times \frac{0.836}{0.918} = 91\%.\]

Thus, based on this methods of obtaining a binomial-based CI, the 95% CI is 91% to 100%.

4.. The take-home data-analysis message from these 2 exercises is that even though in many instances the unconditional approach and a z-based CI in the logRatio scale can provide a CI, the approach where we condition on the total number of cases results in a simpler model whose single parameter (\(\pi\)) is a easily inverted function of the parameter of interest, VE. This total number of cases is also waht drives the precision/power of the trial, and it is a key design characteristic: it determines when the trial results are unblinded.

3 Pfizer/BioNTech’s Protocol: BACK-CALCULATIONS; Bayesian CREDIBLE Interval vs. Frequentist CONFIDENCE Interval

PHASE 1/2/3, PLACEBO-CONTROLLED, RANDOMIZED, OBSERVER-BLIND, DOSE-FINDING STUDY TO EVALUATE THE SAFETY, TOLERABILITY, IMMUNOGENICITY, AND EFFICACY OF SARS-COV-2 RNA VACCINE CANDIDATES AGAINST COVID-19 IN HEALTHY INDIVIDUALS

See the Nov. 9 announcement and the (revised Nov2020) Protocol.

The total number of infections is 94, and the claimed VE is over 90%.

So, BACK-CALCULATING we can put an upper bound on how many of these were in the Vaccine arm

n.infections = 94  #  (if interested, set to 43 for the HPV trial)
n.infections.in.V.arm = 0:9

P = round(n.infections.in.V.arm / n.infections,3)
VE = round(100 * (1-2*P)/(1-P),1)
CI.Pi = lapply(n.infections.in.V.arm,binom.test, n=n.infections)
extract.limits=function(x) x$conf.int 
CI.p = round(
  t(matrix(unlist(lapply(CI.Pi,extract.limits)),2,length(VE))),3)
VE.lower = round(100 * (1-2*CI.p[,2])/(1-CI.p[,2]) ,1) 
VE.upper = round(100 * (1-2*CI.p[,1])/(1-CI.p[,1]) ,1) 
cbind(n.infections.in.V.arm, VE, P, CI.p, 
      VE.lower, VE.upper)
##       n.infections.in.V.arm    VE     P             VE.lower VE.upper
##  [1,]                     0 100.0 0.000 0.000 0.038     96.0    100.0
##  [2,]                     1  98.9 0.011 0.000 0.058     93.8    100.0
##  [3,]                     2  97.9 0.021 0.003 0.075     91.9     99.7
##  [4,]                     3  96.7 0.032 0.007 0.090     90.1     99.3
##  [5,]                     4  95.5 0.043 0.012 0.105     88.3     98.8
##  [6,]                     5  94.4 0.053 0.017 0.120     86.4     98.3
##  [7,]                     6  93.2 0.064 0.024 0.134     84.5     97.5
##  [8,]                     7  92.0 0.074 0.030 0.147     82.8     96.9
##  [9,]                     8  90.7 0.085 0.037 0.161     80.8     96.2
## [10,]                     9  89.4 0.096 0.045 0.174     78.9     95.3

HOW SIMILAR IS THEIR (Bayesian) CREDIBLE INTERVAL TO A (Frequentist) CONFIDENCE INTERVAL?

From Section 9.5 of Nov2020 protocol

Bayesian approaches require specification of a prior distribution for the possible values of the unknown vaccine effect, thereby accounting for uncertainty in its value. A minimally informative beta prior, beta (0.700102, 1), is proposed for θ = (1-VE)/(2-VE). The prior is centered at θ = 0.4118 (VE=30%) which can be considered pessimistic. The prior allows considerable uncertainty; the 95% interval for θ is (0.005, 0.964) and the corresponding 95% interval for VE is (-26.2, 0.995).

Let is assume that of the 94 cases of COVID, 5 were in the V arm.

n.cases   = 94
n.cases.v =  5

( CI.Pi = binom.test(n.cases.v,n.cases)$conf.int )
## [1] 0.01749441 0.11977880
## attr(,"conf.level")
## [1] 0.95
CI.Pi[3] = n.cases.v/n.cases 
       # point est. of Pi=Prob(Case had been in V arm)

Pi.To.VE = function(x) 100*(1-2*x)/(1-x)

order = c(2,3,1) # so in increasing VE order

( CI.VE = round(unlist(lapply(CI.Pi, Pi.To.VE )),1)[order] )
## [1] 86.4 94.4 98.2
prior.pi.a = 0.70 # see above excerpt
prior.pi.b = 1.00

( mean.prior.pi =  prior.pi.a/(prior.pi.a + prior.pi.b) )
## [1] 0.4117647
# Note, this is the MEAN, not the CENTER

( prior.CRI.Pi = qbeta(c(0.025,0.5, 0.975), 
      prior.pi.a,            prior.pi.b) ) # NOTE MEDIAN 0.37
## [1] 0.005144496 0.371498572 0.964477961
( prior.CRI.VE = round(unlist(lapply(prior.CRI.Pi, Pi.To.VE )),1) )
## [1]    99.5    40.9 -2615.2
( posterior.CRI.Pi = qbeta(c(0.025,0.975), 
       prior.pi.a+n.cases.v, prior.pi.b+(n.cases-n.cases.v)) )
## [1] 0.02168671 0.11468312
   posterior.CRI.Pi[3] = n.cases.v/n.cases # point est.

( posterior.CRI.VE = round(unlist(
     lapply(posterior.CRI.Pi, Pi.To.VE )),1)[order] )
## [1] 87.0 94.4 97.8
( CI.VE )
## [1] 86.4 94.4 98.2

4 The AstraZeneca/Oxford trial: Is randomizing 2 Vaccine : 1 Placebo. How does that affect (a) the efficiency (b) the analysis? Checking the Power Calculation

see the protocol

Suppose \(n.v\) and \(n.p\) are the numbers of cases of infection in the V and P arms.

POINT ESTIMATE OF VE

The point estimate of the Rate Ratio is \[\frac{n.v \ / \ 2}{n.p},\] so the point estimate of VE is \[1 - \frac{n.v}{2 \times n.p}.\]

If you calculate what proportion \(p\) of the infected cases had been vaccinated, i.e. \(p = n.v/(n.v+n.p),\) then it is related to the VE point estimate as follows is \[p = \frac{ (2/3) \times (100 - VE)}{(2/3) \times (100 - VE) + (1/3) \times 100 } = \frac{200-2 \times VE}{300 - 2 \times VE}.\] The reverse transform (which you would need if you wish to use invert a binomial-based CI) is \[ VE = 100 \times \frac{2 - 3 \times p}{2 - 2 \times p}.\]

PRECISION / EFFICIENCY

The variance of the log of the Rate Ratio is \[1/n.v + 1/n.p.\]

The table below uses expected numbers under VE = 40(10)90% efficacy, if \(n.total\) = 150, the final number at which the study will be unblinded.

F.v is the proportion of subjects randomized to V.

VE = seq(40,90,10)
RR = 100-VE

for( V.P.ratio in seq(1,4,0.5) ) {
  
   F.v = V.P.ratio/(V.P.ratio+1)

   E.n.v = 150 * F.v*(100 - VE) / ( F.v*(100 - VE) + (1 - F.v)*100 )
   E.n.p = 150 - E.n.v
   
   Var.log.Ratio  = 1/E.n.v + 1/E.n.p
   SE.log.Ratio = sqrt( Var.log.Ratio )
   MME = exp( qnorm(0.975) * SE.log.Ratio )
   
   RR.upper = RR * MME
   VE.lower = 100 - RR.upper
   
   #results = t(matrix( c(V.P.ratio, F.v, E.n.v, E.n.p, 
   #                  Var.log.Ratio, SE.log.Ratio, MME   ),7,3 ))
   
   if( V.P.ratio==1) print(
      noquote(".................   Lower Limit if VE point estimate is  ."))
   if( V.P.ratio==1) print( noquote("V.P.ratio     F.v   40%    50%    60%    70%    80%    90%" ) )
   if( V.P.ratio==1) print( noquote(" "))
   print(noquote( c(V.P.ratio, "......", round(F.v,2), 
                                round(VE.lower) ) ) )
}
## [1] .................   Lower Limit if VE point estimate is  .
## [1] V.P.ratio     F.v   40%    50%    60%    70%    80%    90%
## [1]  
## [1] 1      ...... 0.5    16     30     43     56     69     83    
## [1] 1.5    ...... 0.6    17     31     44     58     71     84    
## [1] 2      ...... 0.67   17     31     45     58     71     85    
## [1] 2.5    ...... 0.71   17     31     45     59     72     85    
## [1] 3      ...... 0.75   16     31     45     59     72     85    
## [1] 3.5    ...... 0.78   16     30     45     59     72     86    
## [1] 4      ...... 0.8    15     30     44     59     72     86

There is little loss of precision even as the randomization departs from 1:1.

Effect of unequal sample sizes on the precision of estimated differences in simpler situations where the unit variance is the same in both arms.

The \(1/n.cases.in.vaccinated.arm + 1/n.cases.in.placebo.arm\) variance of the log.ratio is minimized when the two numbers are closer to a 50:50 ratio, so if VE is expected to be large, it may make statistical sense to have the V arm be larger than the P arm.

Checking the Power Calculation in the Astra Zeneca Trial

# AZD1222 

D = 150 # C&H notation for no. of Cases

alpha = 0.05

V.NOT = 2/1

E.null = 0.3

V.P = 2

( P.case.was.V.null = V.P *(1-E.null) / 
                   (  V.P *(1-E.null) + 1*1 ) )
## [1] 0.5833333
7/12
## [1] 0.5833333
n.cases.in.V.arm = 0:D

probs.under.null = dbinom(
  n.cases.in.V.arm, D, P.case.was.V.null )
plot(n.cases.in.V.arm,
     probs.under.null,type="h",lwd=1,
     cex.lab=1.5, cex.axis=1.5,
     ylab="Prob|NULL in black, Prob|ALT in red",
     xlim = c(40,110))

( n.critical = max(n.cases.in.V.arm[
  pbinom(n.cases.in.V.arm, D, P.case.was.V.null) < alpha/2]) )
## [1] 75
points(n.critical,0,pch=19,cex=1)
pbinom(n.critical,D,P.case.was.V.null)
## [1] 0.02400568
binom.test(n.critical,D,P.case.was.V.null)
## 
##  Exact binomial test
## 
## data:  n.critical and D
## number of successes = 75, number of trials = 150, p-value = 0.0465
## alternative hypothesis: true probability of success is not equal to 0.5833333
## 95 percent confidence interval:
##  0.4173574 0.5826426
## sample estimates:
## probability of success 
##                    0.5
E.alt = 0.6

( P.case.was.V.alt = V.P *(1-E.alt) / 
    (  V.P *(1-E.alt) + 1*1 ) )
## [1] 0.4444444
pbinom(n.critical,D,P.case.was.V.alt)
## [1] 0.9263338
probs.under.alt = dbinom(
  n.cases.in.V.arm, D, P.case.was.V.alt )
sig = n.cases.in.V.arm <= n.critical
points(n.cases.in.V.arm[sig]-0.35,
       probs.under.alt[sig],type="h",
       col= "red",
       lwd=0.5)

E.null = D*P.case.was.V.null
txt = paste("Binomial('n'=",toString(D), 
            ", mu = ", toString(E.null),")" )
text(E.null, max(probs.under.null),txt,
     adj=c(-.1,-0.1),cex=1.25)

E.alt = D*P.case.was.V.alt
txt = paste("Binomial('n'=",toString(D), 
            ", mu = ", toString(round(E.alt,1)),")" )
text(E.alt, max(probs.under.null),txt,
     adj=c(1.1,-0.1),cex=1.25,col="red")
Checking the Power Calculation in the Astra Zeneca Trial. 2:1 Randomization to Vaccine:Placebo arms. Null H: Vaccine Efficacy <= 30%. Analyze after 150 Infections. Alternative H: Vaccine Efficacy = 60%.

Checking the Power Calculation in the Astra Zeneca Trial. 2:1 Randomization to Vaccine:Placebo arms. Null H: Vaccine Efficacy <= 30%. Analyze after 150 Infections. Alternative H: Vaccine Efficacy = 60%.