Newer
Older
\href{https://data.snf.ch/grants/grant/189295}{\#189295}).
The code and data to reproduce our analyses is openly available at
\url{https://gitlab.uzh.ch/samuel.pawel/rsAbsence}. A snapshot of the repository
at the time of writing is available at
\url{https://doi.org/10.5281/zenodo.7906792}. We used the statistical
programming language R version \Sexpr{paste(version$major, version$minor, sep =
".")} \citep{R} for analyses. The R packages \texttt{ggplot2}
\citep{Wickham2016}, \texttt{dplyr} \citep{Wickham2022}, \texttt{knitr}
\citep{Xie2022}, and \texttt{reporttools} \citep{Rufibach2009} were used for
plotting, data preparation, dynamic reporting, and formatting, respectively. The
data from the RPCB were obtained by downloading the files from
\url{https://github.com/mayamathur/rpcb} (commit a1e0c63) and extracting the
relevant variables as indicated in the R script \texttt{preprocess-rpcb-data.R}
\section*{Appendix: Sensitivity analyses}
As discussed before, the post-hoc specification of equivalence margins $\Delta$
and prior distribution for the SMD under the alternative $H_{1}$ is debatable.
Commonly used margins in clinical research are much more stringent; for
instance, in oncology, a margin of $\Delta = \log(1.3)$ is commonly used for log
odds/hazard ratios, whereas in bioequivalence studies a margin of
\mbox{$\Delta = \log(1.25) % = \Sexpr{round(log(1.25), 2)}
$} is the convention \citep[chapter 22]{Senn2008}. These margins would
translate into margins of $\Delta = % \log(1.3)\sqrt{3}/\pi =
\Sexpr{round(log(1.3)*sqrt(3)/pi, 2)}$ and $\Delta = % \log(1.25)\sqrt{3}/\pi =
\Sexpr{round(log(1.25)*sqrt(3)/pi, 2)}$ on the SMD scale, respectively, using
the $\text{SMD} = (\surd{3} / \pi) \log\text{OR}$ conversion \citep[p.
233]{Cooper2019}. Similarly, for the Bayesian factor we specified a normal
unit-information prior under the alternative while other normal priors with
smaller/larger standard deviations could have been considered. Here, we
therefore investigate the sensitivity of our conclusions with respect to these
parameters.
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
\begin{figure}[!htb]
<< "sensitivity", fig.height = 6.5 >>=
## compute number of successful replications as a function of the equivalence margin
marginseq <- seq(0.01, 4.5, 0.01)
alphaseq <- c(0.01, 0.05, 0.1)
sensitivityGrid <- expand.grid(m = marginseq, a = alphaseq)
equivalenceDF <- lapply(X = seq(1, nrow(sensitivityGrid)), FUN = function(i) {
m <- sensitivityGrid$m[i]
a <- sensitivityGrid$a[i]
rpcbNull$ptosto <- with(rpcbNull, pmax(pnorm(q = smdo, mean = m, sd = so,
lower.tail = TRUE),
pnorm(q = smdo, mean = -m, sd = so,
lower.tail = FALSE)))
rpcbNull$ptostr <- with(rpcbNull, pmax(pnorm(q = smdr, mean = m, sd = sr,
lower.tail = TRUE),
pnorm(q = smdr, mean = -m, sd = sr,
lower.tail = FALSE)))
successes <- sum(rpcbNull$ptosto <= a & rpcbNull$ptostr <= a)
data.frame(margin = m, alpha = a,
successes = successes, proportion = successes/nrow(rpcbNull))
}) %>%
bind_rows()
## plot number of successes as a function of margin
nmax <- nrow(rpcbNull)
bks <- c(0, 3, 6, 9, 11, 15)
labs <- paste0(bks, " (", round(bks/nmax*100, 0), "%)")
rpcbSuccesses <- 11
marbks <- c(0, margin, 1, 2, 3, 4)
plotA <- ggplot(data = equivalenceDF,
aes(x = margin, y = successes,
color = factor(alpha, ordered = TRUE, levels = rev(alphaseq)))) +
facet_wrap(~ 'italic("p")["TOST"] <= alpha ~ "in original and replication study"',
labeller = label_parsed) +
geom_vline(xintercept = margin, lty = 3, alpha = 0.4) +
annotate(geom = "segment", x = margin + 0.25, xend = margin + 0.01, y = 2, yend = 2,
arrow = arrow(type = "closed", length = unit(0.02, "npc")), alpha = 0.9,
color = "darkgrey") +
annotate(geom = "text", x = margin + 0.28, y = 2, color = "darkgrey",
label = "margin used in main analysis",
size = 3, alpha = 0.9, hjust = 0) +
geom_hline(yintercept = rpcbSuccesses, lty = 2, alpha = 0.4) +
annotate(geom = "segment", x = 0.1, xend = 0.1, y = 13, yend = 11.2,
arrow = arrow(type = "closed", length = unit(0.02, "npc")), alpha = 0.9,
color = "darkgrey") +
annotate(geom = "text", x = -0.04, y = 13.5, color = "darkgrey",
label = "non-significance criterion successes",
size = 3, alpha = 0.9, hjust = 0) +
geom_step(alpha = 0.8, linewidth = 0.8) +
scale_y_continuous(breaks = bks, labels = labs) +
scale_x_continuous(breaks = marbks) +
coord_cartesian(xlim = c(0, max(equivalenceDF$margin))) +
labs(x = bquote("Equivalence margin" ~ Delta),
y = "Successful replications",
color = bquote("threshold" ~ alpha)) +
theme_bw() +
theme(panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
strip.background = element_rect(fill = alpha("tan", 0.4)),
strip.text = element_text(size = 12),
legend.position = c(0.85, 0.25),
plot.background = element_rect(fill = "transparent", color = NA),
legend.box.background = element_rect(fill = "transparent", colour = NA))
## compute number of successful replications as a function of the prior scale
priorsdseq <- seq(0, 40, 0.1)
bfThreshseq <- c(3, 6, 10)
sensitivityGrid2 <- expand.grid(s = priorsdseq, thresh = bfThreshseq)
bfDF <- lapply(X = seq(1, nrow(sensitivityGrid2)), FUN = function(i) {
priorsd <- sensitivityGrid2$s[i]
thresh <- sensitivityGrid2$thresh[i]
rpcbNull$BForig <- with(rpcbNull, BF01(estimate = smdo, se = so, unitvar = priorsd^2))
rpcbNull$BFrep <- with(rpcbNull, BF01(estimate = smdr, se = sr, unitvar = priorsd^2))
successes <- sum(rpcbNull$BForig >= thresh & rpcbNull$BFrep >= thresh)
data.frame(priorsd = priorsd, thresh = thresh,
successes = successes, proportion = successes/nrow(rpcbNull))
}) %>%
bind_rows()
## plot number of successes as a function of prior sd
priorbks <- c(0, 2, 10, 20, 30, 40)
plotB <- ggplot(data = bfDF,
aes(x = priorsd, y = successes, color = factor(thresh, ordered = TRUE))) +
facet_wrap(~ '"BF"["01"] >= gamma ~ "in original and replication study"',
labeller = label_parsed) +
geom_vline(xintercept = 2, lty = 3, alpha = 0.4) +
geom_hline(yintercept = rpcbSuccesses, lty = 2, alpha = 0.4) +
annotate(geom = "segment", x = 7, xend = 2 + 0.2, y = 0.5, yend = 0.5,
arrow = arrow(type = "closed", length = unit(0.02, "npc")), alpha = 0.9,
color = "darkgrey") +
annotate(geom = "text", x = 7.5, y = 0.5, color = "darkgrey",
label = "standard deviation used in main analysis",
size = 3, alpha = 0.9, hjust = 0) +
annotate(geom = "segment", x = 0.5, xend = 0.5, y = 13, yend = 11.2,
arrow = arrow(type = "closed", length = unit(0.02, "npc")), alpha = 0.9,
color = "darkgrey") +
annotate(geom = "text", x = 0.05, y = 13.5, color = "darkgrey",
label = "non-significance criterion successes",
size = 3, alpha = 0.9, hjust = 0) +
geom_step(alpha = 0.8, linewidth = 0.8) +
scale_y_continuous(breaks = bks, labels = labs, limits = c(0, nmax)) +
scale_x_continuous(breaks = priorbks) +
coord_cartesian(xlim = c(0, max(bfDF$priorsd))) +
labs(x = "Prior standard deviation",
y = "Successful replications ",
color = bquote("threshold" ~ gamma)) +
theme_bw() +
theme(panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
strip.background = element_rect(fill = alpha("tan", 0.4)),
strip.text = element_text(size = 12),
legend.position = c(0.85, 0.25),
plot.background = element_rect(fill = "transparent", color = NA),
legend.box.background = element_rect(fill = "transparent", colour = NA))
grid.arrange(plotA, plotB, ncol = 1)
@
\caption{Number of successful replications of original null results in the RPCB
as a function of the margin $\Delta$ of the equivalence test
($p_{\text{TOST}} \leq \alpha$ in both studies for
$\alpha = \Sexpr{rev(alphaseq)}$) or the standard deviation of the zero-mean
normal prior distribution for the SMD effect size under the alternative
$H_{1}$ of the Bayes factor test ($\BF_{01} \geq \gamma$ in both studies for
$\gamma = \Sexpr{bfThreshseq}$).}
\label{fig:sensitivity}
\end{figure}
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
The top plot of Figure~\ref{fig:sensitivity} shows the number of
successful replications as a function of the margin $\Delta$ and for different
TOST \textit{p}-value thresholds. Such an ``equivalence curve'' approach was
first proposed by \citet{Hauck1986}. We see that for realistic margins between
$0$ and $1$, the proportion of replication successes remains below $50\%$ for
the conventional $\alpha = 0.05$ level. To achieve a success rate of
$11/15 = \Sexpr{round(11/15*100, 0)}\%$, as was achieved with the
non-significance criterion from the RPCB, unrealistic margins of $\Delta > 2$
are required, highlighting the paucity of evidence provided by these studies.
Changing the success criterion to a more lenient level ($\alpha = 0.1$) or a
more stringent level ($\alpha = 0.01$) hardly changes the conclusion.
The bottom plot of Figure~\ref{fig:sensitivity} shows a sensitivity analysis
regarding the choice of the prior standard deviation and the Bayes factor
threshold. It is uncommon to specify prior standard deviations larger than the
unit-information standard deviation of $2$, as this corresponds to the
assumption of very large effect sizes under the alternatives. However, to
achieve replication success for a larger proportion of replications than the
observed
$\Sexpr{bfSuccesses}/\Sexpr{ntotal} = \Sexpr{round(bfSuccesses/ntotal*100, 0)}\%$,
unreasonably large prior standard deviations have to be specified. For instance,
a standard deviation of roughly $5$ is required to achieve replication success
in $50\%$ of the replications at a lenient Bayes factor threshold of
$\gamma = 3$. The standard deviation needs to be almost $20$ so that the same
success rate $11/15 = \Sexpr{round(11/15*100, 0)}\%$ as with the
non-significance criterion is achieved. The necessary standard deviations are
even higher for stricter Bayes factor threshold, such as $\gamma = 6$ or
$\gamma = 10$.
\bibliography{bibliography}
<< "sessionInfo1", eval = Reproducibility, results = "asis" >>=
## print R sessionInfo to see system information and package versions
## used to compile the manuscript (set Reproducibility = FALSE, to not do that)
cat("\\newpage \\section*{Computational details}")
@
<< "sessionInfo2", echo = Reproducibility, results = Reproducibility >>=
cat(paste(Sys.time(), Sys.timezone(), "\n"))