Building a Foundation for More Flexible A/B Testing: Applications of Interim Monitoring to Large Scale Data
Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022, pp. 412–427
Pub. online: 21 April 2023
Type: Statistical Data Science
Open Access
15 December 2022
17 April 2023
Error spending functions and stopping rules have become powerful tools for conducting interim analyses. Interim analyses are broadly desired not only in traditional clinical trials but also in A/B tests. Although many papers have summarized error spending approaches, limited work has addressed how to find the “optimal” boundary in the context of large-scale data. In this paper, we summarize fifteen boundaries, formed from five error spending functions that allow early termination for futility, for a difference, or for both, as well as a fixed sample size design without interim monitoring. The simulation is based on a practical A/B testing problem comparing two independent proportions. We examine sample sizes from 500 to 250,000 per arm to reflect the different settings in which A/B testing may be used. The optimal boundaries are selected using a proposed loss function that assigns different weights to the expected sample size under a null experiment with no difference between variants, the expected sample size under an experiment with a difference between the variants, and the maximum sample size needed if the A/B test does not stop early at an interim analysis. Results are presented for simulation settings based on adequately powered, under-powered, and over-powered designs, with recommendations for selecting the “optimal” design in each setting.
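As a rough illustration of the ideas in the abstract, the sketch below implements two widely used Lan-DeMets error spending functions (the O'Brien-Fleming-type and Pocock-type forms) and a simple weighted loss over the three sample size quantities. The abstract does not say which five spending functions were studied or give the exact form of the loss, so the function names, the choice of these two spending functions, and the linear weighted sum are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def obf_type_spending(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t
    under the Lan-DeMets O'Brien-Fleming-type spending function."""
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction t must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / math.sqrt(t)))


def pocock_type_spending(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t
    under the Lan-DeMets Pocock-type spending function."""
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction t must be in (0, 1]")
    return alpha * math.log(1.0 + (math.e - 1.0) * t)


def weighted_loss(en_null, en_alt, n_max, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical loss: a weighted sum of the expected sample size under
    the null, the expected sample size under the alternative, and the
    maximum sample size. The paper's actual loss may differ."""
    w_null, w_alt, w_max = weights
    return w_null * en_null + w_alt * en_alt + w_max * n_max


if __name__ == "__main__":
    # Alpha spent at four equally spaced interim looks.
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, obf_type_spending(t), pocock_type_spending(t))
```

Both functions spend the full alpha at t = 1. The O'Brien-Fleming-type form spends very little alpha early and so stops early only for large effects, while the Pocock-type form spends alpha more evenly across looks. Under a loss like the hypothetical one above, candidate boundaries could be ranked by their simulated expected and maximum sample sizes.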
Supplementary material
All tables and figures are uploaded as Supplementary Materials.