From 8978fce84594a5d7252adfa1cd0de4160c9c34b0 Mon Sep 17 00:00:00 2001
From: Rachel Heyard <>
Date: Mon, 20 Mar 2023 18:45:34 +0100
Subject: [PATCH] add reference and comment

 rsAbsence.Rnw | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/rsAbsence.Rnw b/rsAbsence.Rnw
index 30ba225..2f2a1f7 100755
--- a/rsAbsence.Rnw
+++ b/rsAbsence.Rnw
@@ -196,7 +196,7 @@ equivalence testing or Bayes factors, should be used.
 The contextualization of null results becomes even more complicated in the
 setting of replication studies. In a replication study, researchers attempt to
 repeat an original study as closely as possible in order to assess whether
-similar results can be obtained with new data. There have been various
+similar results can be obtained with new data \citep{NSF2019}. There have been various
 large-scale replication projects in the biomedical and social sciences in the
 last decade \citep[among
@@ -423,7 +423,7 @@ with confidence intervals from two RPCB study pairs. Both are ``null results''
 and meet the non-significance criterion for replication success (the two-sided
 $p$-values are greater than 5\% in both the original and the replication study),
 but intuition would suggest that these two pairs are very much different.
+\todo[inline]{RH: these data are really a mess. It turns out that for Dawson, n represents the group size (n = 6 in while in Goetz it is the sample size of the whole experiment (n = 34 and 61 in}
 << "2-example-studies", fig.height = 3.25 >>=
 ## some evidence for absence of effect (when a really genereous margin Delta = 1
@@ -619,15 +619,15 @@ established treatment -- is practically equivalent to the established treatment
 whether an effect is practically equivalent to the value of an absent effect,
 usually zero. The main challenge is to specify the margin $\Delta > 0$ that
 defines an equivalence range $[-\Delta, +\Delta]$ in which an effect is
-considered as absent for practical purposes. The goal is then to reject the 
-composite null hypothesis that the true effect is outside the equivalence range. 
-To ensure that the null hypothesis is falsely rejected at most $\alpha \times 
-100\%$ of the time, one either rejects it if the $(1-2\alpha)\times 100\%$ 
-confidence interval for the effect is contained within the equivalence range 
-(for example, a 90\% confidence interval for $\alpha = 5\%$), or if two 
-one-sided tests (TOST) for the effect being smaller/greater than $+\Delta$ 
-and $-\Delta$ are significant at level $\alpha$, respectively. 
-A quantitative measure of evidence for the absence of an effect is then given 
+considered as absent for practical purposes. The goal is then to reject the
+composite null hypothesis that the true effect is outside the equivalence range.
+To ensure that the null hypothesis is falsely rejected at most $\alpha \times
+100\%$ of the time, one either rejects it if the $(1-2\alpha)\times 100\%$
+confidence interval for the effect is contained within the equivalence range
+(for example, a 90\% confidence interval for $\alpha = 5\%$), or if two
+one-sided tests (TOST) for the effect being smaller/greater than $+\Delta$
+and $-\Delta$ are significant at level $\alpha$, respectively.
+A quantitative measure of evidence for the absence of an effect is then given
 by the maximum of the two one-sided $p$-values.
 \todo{CM: maybe more logical to first discuss margin and then mention the
@@ -762,7 +762,7 @@ If the goal of study is to find evidence for the absence of an effect, the
 replication sample size should also be determined so that the study has adequate
 power to make conclusive inferences regarding the absence of the effect.
-\todo{CM: mention that margin + prior distribution should be chosen 
+\todo{CM: mention that margin + prior distribution should be chosen
 before first/second study is conducted?}