\documentclass[letterpaper, 12pt]{article}
%\usepackage[nomarkers]{endfloat}
%%%%%%%%%
\usepackage{calc}
\usepackage{color}
\usepackage{amsmath,amsthm,amssymb}
\usepackage{graphicx}
\usepackage{float}
% Create new "listing" float
\newfloat{listing}{tbhp}{lst}%[section]
\floatname{listing}{Listing}
\newcommand{\listoflistings}{\listof{listing}{List of Listings}}
\floatstyle{plaintop}
\restylefloat{listing}
\usepackage{natbib}
%\usepackage{multind}
\usepackage{booktabs}
\usepackage{enumerate}
\usepackage{todonotes}
% \usepackage{uarial}
% \renewcommand{\familydefault}{\sfdefault}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% to change %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% \usepackage{lineno}
%% \linenumber
\renewcommand{\baselinestretch}{1} %% 2
\usepackage{color,xcolor}
\definecolor{link}{HTML}{004C80}
\usepackage[labelfont=bf]{caption}
\usepackage[english]{babel}
\usepackage[
  pdftex,
  plainpages=false,
  pdfpagelabels,
  pagebackref=true,
  colorlinks=true,
  citecolor=link,
  linkcolor=link
]{hyperref}
\hypersetup{colorlinks,urlcolor=link}
\usepackage{array, lipsum}
% \usepackage{datetime}
\usepackage[margin=2cm,textwidth=19cm]{geometry}
\usepackage{float}
\newcommand{\eg}{{e.\,g.\,}}
\newcommand{\ie}{{i.\,e.\,}}
\newcommand{\pkg}[1]{\textit{#1}}
\newcommand{\N}{\mathcal{N}}
\setlength\parindent{0pt}
%\newcommand{\todo}[1]{\textcolor{red}{#1}}
\makeatletter
\DeclareRobustCommand*\textsubscript[1]{%
  \@textsubscript{\selectfont#1}}
\def\@textsubscript#1{%
  {\m@th\ensuremath{_{\mbox{\fontsize\sf@size\z@#1}}}}}
\makeatother

\begin{document}

\begin{center}
{\noindent \LARGE \bf Simulation protocol:\\[2mm]
Comparison of confidence intervals summarizing\\[2mm]
the uncertainty of the combined estimate of a meta-analysis
}\\
\bigskip
{\noindent \Large Leonhard Held, Felix Hofmann
}\end{center}
\bigskip
\vspace*{.5cm}

The present protocol is inspired by \citet{burt:etal:06} and \citet{morr:etal:19}. The simulation is implemented in \texttt{simulate\_all.R}.
\tableofcontents
\newpage

\section{Aims and objectives}\label{ref:aims}

The aim of this simulation study is the comparison of confidence intervals (CIs) summarizing the uncertainty of the combined estimate of a meta-analysis. Specifically, we focus on CIs constructed from $p$-value functions that implement the $p$-value combination methods of \citet{edgington:72} and \citet{fisher:34}. The underlying data sets are simulated as described in Section~\ref{sec:simproc}. In Section~\ref{sec:analysis} we describe which CI construction methods we compare in this simulation study and which criteria we use to evaluate them.

\section{Simulation of the data sets}
\label{sec:simproc}

\subsection{Allowance for failures}

We expect no failures, \ie, for all simulated data sets all types of CI methods should lead to a valid CI and all valid CIs should lead to valid CI criteria. If a failure occurs, we stop the simulation and investigate the reason for the failure.

\subsection{Software to perform simulations}

The simulation study is performed using the statistical software R \citep{R}. We save the output of \texttt{sessionInfo()}, giving information on the version of R, the packages, and the platform used, together with the simulation results.

\subsection{Random number generator}

We use the package \pkg{doRNG} \citep{doRNG} with its default random number generator to ensure that random numbers generated inside parallel for loops are independent and reproducible.

\subsection{Scenarios to be investigated}
\label{sec:scenario}

The $1080$ simulated scenarios consist of all combinations of the following parameters:
\begin{itemize}
\item Higgins' $I^2$ heterogeneity measure $\in \{0, 0.3, 0.6, 0.9\}$.
\item Number of studies summarized by the meta-analysis $k \in \{3, 5, 10, 20, 50\}$.
\item Publication bias $\in \{\text{'none'}, \text{'moderate'}, \text{'strong'}\}$, following the terminology of \citet{henm:copa:10}.
\item The average study effect $\theta \in \{0.1, 0.2, 0.5\}$.
\item The distribution from which the true study effects $\delta_i$ are drawn is either Gaussian or Student-$t$ with 4 degrees of freedom. The latter still has finite mean and variance, but leads to more `outliers'.
\item The sample size $n_i$ of the $i$-th study (number of patients per study) is $n_i = 50$ (small study), except for 0, 1, or 2 studies where $n_i = 500$ (large study).
\end{itemize}
Note that \citet{IntHoutIoannidis} use a similar setup.

\subsection{Simulation details}

The simulation of one meta-analysis data set is performed as follows:
\begin{enumerate}
\item Compute the within-study variance
\begin{equation} \label{eq:eps2}
\epsilon^2 = \frac{2}{k} \sum\limits_{i=1}^k \frac{1}{n_i}.
\end{equation}
\item Compute the between-study variance
\begin{equation}\label{eq:eq1}
\tau^2 = \epsilon^2 \frac{I^2}{1-I^2}.
\end{equation}
\item For each trial $i$ of the meta-analysis with $k$ trials, $i = 1, \dots, k$:
\begin{enumerate}
\item Simulate the true effect size using the Gaussian model $\delta_i \sim \N(\theta, \tau^2)$, or using a Student-$t$ distribution with 4 degrees of freedom scaled such that the samples have mean $\theta$ and variance $\tau^2$.
\item Simulate the effect estimate of the trial: $y_i \sim \N(\delta_i, \frac{2}{n_i})$.
\item Simulate the standard error of the trial outcome: $\text{se}_i = \sqrt{X_i / \{(n_i-1)\, n_i\}}$ with $X_i \sim \chi^2(2 n_i - 2)$.
\end{enumerate}
\end{enumerate}

\paragraph{Note: The marginal variance}\mbox{}\\
The marginal variance of this simulation procedure is $\tau^2 + 2/n_i$ and thus follows the additive heterogeneity model, as intended.

\paragraph{Note: Publication bias}\mbox{}\\
To simulate studies under \textbf{publication bias}, we follow the suggestion of \citet{henm:copa:10} and accept each simulated study with probability
\begin{equation} \label{eq:pbias}
\exp(-4\, \Phi(-y_i / \text{se}_i)^\gamma ),
\end{equation}
where $\gamma = 3$ and $\gamma = 1.5$ correspond to \emph{moderate} and \emph{strong} publication bias, respectively.
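The simulation steps above, together with the acceptance probability for publication bias, can be sketched as follows. This is an illustrative Python translation only: the study itself is implemented in R (see \texttt{simulate\_all.R} and \texttt{simREbias()}), and all function and variable names here are our own.

```python
# Illustrative sketch of simulating one meta-analysis data set.
# The actual study uses R; names below are our own, not the study's API.
import numpy as np
from math import erf

rng = np.random.default_rng(2024)

def phi(x):
    """Standard normal CDF via the error function (scalar input)."""
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def simulate_meta(k, n, theta, I2, dist="gaussian"):
    """Simulate effect estimates y_i and standard errors se_i for k trials."""
    n = np.asarray(n, dtype=float)
    eps2 = (2.0 / k) * np.sum(1.0 / n)    # within-study variance epsilon^2
    tau2 = eps2 * I2 / (1.0 - I2)         # between-study variance tau^2
    if dist == "gaussian":
        delta = rng.normal(theta, np.sqrt(tau2), size=k)
    else:
        # t with 4 df has variance 4/(4-2) = 2, so rescale to variance tau2
        delta = theta + rng.standard_t(4, size=k) * np.sqrt(tau2 / 2.0)
    y = rng.normal(delta, np.sqrt(2.0 / n))                  # effect estimates
    se = np.sqrt(rng.chisquare(2 * n - 2) / ((n - 1) * n))   # standard errors
    return y, se

def accept_prob(y_i, se_i, gamma):
    """Acceptance probability of a single study under publication bias."""
    return np.exp(-4.0 * phi(-y_i / se_i) ** gamma)
```

Under publication bias, a rejected small study would be re-simulated until it is accepted, as described in the text.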
That is, accepted studies are kept, while for a rejected study we replace $y_i$ and $\text{se}_i$ by newly simulated values, which are then again accepted with the probability given above. This procedure is repeated until the required number of studies has been simulated. However, we assume that only small studies with $n_i = 50$ are subject to publication bias; larger studies with $n_i = 500$ are always accepted. As described in Section~\ref{sec:scenario}, we set $\theta \in \{0.1, 0.2, 0.5\}$. See the R function \texttt{simREbias()}.

In order to check how this implementation of publication bias impacts the simulation performance, we keep track of the mean acceptance probability for each simulation scenario that is subject to publication bias. For the calculation of the mean, we also consider large studies with $n_i = 500$. Since such studies are not subject to publication bias, they have an acceptance probability of 1.

\subsection{Simulation procedure}

For each scenario in Section~\ref{sec:scenario} we
\begin{enumerate}
\item simulate 10'000 meta-analysis data sets,
\item compute the CIs listed in Section~\ref{sec:method} for each meta-analysis, and
\item summarize the performance of the CIs by the criteria listed in Section~\ref{sec:meas}.
\end{enumerate}

\section{Analysis of the confidence intervals}
\label{sec:analysis}

This section contains an overview of the construction methods for CIs that we consider in this simulation. Moreover, we explain which measures we use in order to compare the different CIs with each other.

\subsection{Construction methods for confidence intervals}
\label{sec:method}

For this project, we will calculate 95\% CIs according to the following methods.
\begin{enumerate}
\item Hartung-Knapp-Sidik-Jonkman (HK) \citep{IntHoutIoannidis}.
\item Random effects model.
\item Henmi and Copas (HC) \citep{henm:copa:10}.
\item Edgington's method \citep{edgington:72}.
\item Fisher's method \citep{fisher:34}.
\end{enumerate}

\subsection{Definition of the variance estimates}
\label{sec:varadj}

As we assume an additive heterogeneity model, we will calculate the confidence intervals for the methods \emph{Fisher}, \emph{Edgington}, and \emph{Random effects} based on the following estimators of the between-study variance $\tau^2$. The estimator thus acts as an additional scenario dimension that applies only to the methods mentioned above.
\begin{enumerate}
\item No heterogeneity, \ie $\tau^2 = 0$.
\item DerSimonian-Laird \citep{ders:lair:86}.
\item Paule-Mandel \citep{paul:man:82}.
\item REML \citep{harv:77}.
\end{enumerate}
In the simulation, the estimates will be calculated using the \texttt{metagen} function from the \texttt{R} package \pkg{meta} \citep{meta}. The adjusted study-specific standard errors are then given by $\text{se}_{\text{adj}}(\hat{\theta}_i) = \sqrt{\text{se}(\hat{\theta}_i)^2 + \hat{\tau}^2}$.

\subsection{Measures considered}
\label{sec:meas}

We assess the CIs using the following criteria:
\begin{enumerate}
\item CI coverage of the combined effect, \ie, the proportion of intervals containing the true effect. If the CI does not exist for a specific simulated data set, we record the coverage as missing (\texttt{NA}).
\item CI width. If there is more than one interval, the width is the sum of the lengths of the individual intervals. If the interval does not exist for a simulated data set, the width will be recorded as missing (\texttt{NA}). %width
\item Interval score \citep{Gnei:Raft:07}. If the interval does not exist for a simulated data set, the score will be recorded as missing (\texttt{NA}). % score
\item Number of CIs (only for the Fisher and Edgington methods). If the interval does not exist for a simulated data set, the number of CIs will be recorded as 0. % n
\end{enumerate}
Furthermore, we calculate the following measures related to the point estimates.
\begin{enumerate}
\item Mean squared error (MSE) of the estimator.
\item Bias of the estimator.
\item Variance of the estimator.
\end{enumerate}

\paragraph{Note: Uniqueness of the point estimate}\mbox{}\\
As the point estimate for the methods \emph{Edgington} and \emph{Fisher}, we use the value at which the $p$-value function is maximal. However, this definition does not guarantee a unique point estimate. As the computation of the above measures assumes unique point estimates, we record meta-analyses with more than one combined point estimate as missing (\texttt{NA}).

\vspace*{.5cm}

For the \emph{Edgington} and \emph{Fisher} methods, we also investigate the distribution of the highest value of the $p$-value function between the lowest and the highest treatment effect of the simulated studies. In order to do so, we calculate the following measures:
\begin{itemize}
\item Minimum
\item First quartile
\item Mean
\item Median
\item Third quartile
\item Maximum
\end{itemize}

\vspace*{.5cm}

As both methods can result in more than one CI for a given meta-analysis, we record the relative frequency of the number of intervals $m$ over the 10'000 iterations for each of the scenarios mentioned in Section~\ref{sec:scenario}. However, we truncate the distribution by summarising all events where the number of intervals is $> 9$ into a single category.

\section{Estimates to be stored for each simulation and summary measures to be calculated over all simulations}

For each simulated meta-analysis we construct CIs according to all methods (Section~\ref{sec:method}) and calculate all available assessments (Section~\ref{sec:meas}) for the respective method. For assessments 1--3 in Subsection~\ref{sec:meas} we only store the mean value over the 10'000 iterations of a specific scenario. Possible missing values (\texttt{NA}) are removed before calculating the mean. However, we also record the proportion of non-missing values in order to provide an overview of the number of observations used to calculate the mean.
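The interval score of assessment 3 and the NA-aware averaging just described can be sketched as follows. This is again an illustrative Python sketch with our own names, not the R implementation used in the study.

```python
# Illustrative sketch of the interval score (Gneiting and Raftery, 2007)
# and of the NA-aware summarisation over the iterations of one scenario.
import numpy as np

def interval_score(lower, upper, truth, alpha=0.05):
    """Interval score of a central (1 - alpha) CI [lower, upper] at 'truth':
    the width plus a penalty of 2/alpha per unit by which 'truth' falls
    outside the interval."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    return ((upper - lower)
            + (2.0 / alpha) * (lower - truth) * (truth < lower)
            + (2.0 / alpha) * (truth - upper) * (truth > upper))

def summarise(values):
    """Mean over the non-missing values and the proportion of non-missing."""
    values = np.asarray(values, dtype=float)
    ok = ~np.isnan(values)
    return values[ok].mean(), ok.mean()
```

A covered interval contributes only its width, so shorter intervals score better as long as coverage is maintained.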
The measures related to the point estimates are calculated over the entire sample of 10'000 iterations. Possible missing values (\texttt{NA}) are removed before the calculations. As for the confidence interval assessments, we also record the proportion of non-missing values.

Regarding the distribution of the highest value of the $p$-value function, we store the summary measures mentioned in the respective paragraph of Subsection~\ref{sec:meas}. We calculate the relative frequencies of the number of intervals $m = 0, 1, \ldots, 9, > 9$ in each confidence set over the 10'000 iterations of the same scenario.

Furthermore, we store the mean of the average acceptance probability over the 10'000 iterations for all simulation scenarios with either 'moderate' or 'strong' publication bias.

\section{Presentation of the simulation results}

For each of the performance measures 1--3 in Subsection~\ref{sec:meas} as well as the mean squared error (MSE), bias, and variance we construct plots with
\begin{itemize}
\item the number of studies $k$ on the $x$-axis
\item the performance measure on the $y$-axis
\item one connecting line and color for each value of $I^2$
\item one panel for each CI method
\end{itemize}

Regarding the distribution of the $p$-value function for the \emph{Edgington} and \emph{Fisher} methods, we will create plots that contain
\begin{itemize}
\item the number of studies $k$ on the $x$-axis
\item the value of the summary statistic on the $y$-axis
\item one connecting line and color for each summary statistic
\item one panel for each CI method
\end{itemize}

The plots for the relative frequencies of the number of intervals have
\begin{itemize}
\item the category ($1$ to $9$ and $>9$) indicating the number of intervals $m$ on the $x$-axis
\item the relative frequency on the $y$-axis
\item a bar for each category indicating the relative frequency for the respective category
\item one panel for each CI method
\end{itemize}

\newpage
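The truncated tabulation of the number of intervals $m$ used for these frequency plots can be sketched as follows (illustrative Python with our own names; the study itself is implemented in R).

```python
# Illustrative sketch of tabulating the relative frequencies of the number
# of intervals m, collapsing all values above 9 into a single '>9' category.
import numpy as np

def interval_count_freqs(m, max_m=9):
    """Relative frequencies for m = 0, 1, ..., max_m plus a '>max_m' bin."""
    m = np.minimum(np.asarray(m, dtype=int), max_m + 1)  # collapse m > max_m
    counts = np.bincount(m, minlength=max_m + 2)         # bins 0..max_m, >max_m
    return counts / counts.sum()
```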
\bibliographystyle{apalike}
\bibliography{biblio}
\end{document}