Accelerating Fixed-Point Algorithms in Statistics and Data Science: A State-of-Art Review
Volume 21, Issue 1 (2023), pp. 1–26
Pub. online: 19 July 2022
Type: Data Science Reviews
Open Access
Received: 7 March 2022
Accepted: 25 May 2022
Published: 19 July 2022
Abstract
Fixed-point algorithms are popular in statistics and data science due to their simplicity, guaranteed convergence, and applicability to high-dimensional problems. Well-known examples include the expectation-maximization (EM) algorithm, majorization-minimization (MM), and gradient-based algorithms such as gradient descent (GD) and proximal gradient descent. A characteristic weakness of these algorithms is their slow convergence. We discuss several state-of-the-art techniques for accelerating their convergence, and we demonstrate and evaluate these techniques in terms of their efficiency and robustness in six distinct applications. Among the acceleration schemes, SQUAREM shows robust acceleration with a mean 18-fold speedup; the DAAREM and restarted-Nesterov schemes also demonstrate consistently impressive accelerations. Thus, it is possible to accelerate the original fixed-point algorithm by using one of the SQUAREM, DAAREM, or restarted-Nesterov schemes. We describe implementation details and software packages to facilitate the application of the acceleration schemes, and we discuss strategies for selecting a particular scheme for a given problem.
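As a minimal illustration of the kind of acceleration evaluated in this review (a sketch only, not the paper's benchmark code), the CRAN package SQUAREM provides both a plain fixed-point driver, fpiter(), and the SQUAREM accelerator, squarem(). Below, both are applied to the toy contraction map x -> cos(x), and the number of fixed-point evaluations is compared; the map, starting value, and tolerance are illustrative choices, not taken from the paper.

library(SQUAREM)

fixptfn <- function(x) cos(x)   # toy fixed-point map; its fixed point is near 0.739

p0 <- 1                         # illustrative starting value

# Plain fixed-point iteration, for reference
plain <- fpiter(par = p0, fixptfn = fixptfn, control = list(tol = 1e-10))

# SQUAREM acceleration of the same map
accel <- squarem(par = p0, fixptfn = fixptfn, control = list(tol = 1e-10))

# Compare the number of fixed-point evaluations each scheme needed
c(plain = plain$fpevals, squarem = accel$fpevals)

The same pattern applies to the realistic fixed-point maps discussed in the paper: the user supplies the mapping (e.g., one EM or MM update) as fixptfn, and the acceleration scheme drives it to convergence.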
Supplementary Material
1. We provide an R package, AccelBenchmark, available on GitHub, that can be used to reproduce the experiments in this paper. Please see the vignette and the demo.R file in the vignette folder for reference.
2. We provide an additional PDF file that includes (a) some basic properties of fixed-point iterations; (b) visualizations of the convergence for all the experiments; and (c) additional analysis of the Sinkhorn scaling problem (a toy version of this scaling map is sketched below).
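For concreteness, the supplementary Sinkhorn analysis concerns the classical Sinkhorn scaling fixed-point map, which alternately normalizes the rows and columns of a positive matrix until it is (approximately) doubly stochastic. The following base-R sketch of that map is for illustration only and is not the code used in the benchmarks; the matrix and tolerance are arbitrary.

# One Sinkhorn scaling step: normalize rows, then columns
sinkhorn_map <- function(A) {
  A <- A / rowSums(A)                 # each row now sums to 1
  A <- sweep(A, 2, colSums(A), "/")   # each column now sums to 1
  A
}

set.seed(1)
A <- matrix(runif(25, 0.1, 1), 5, 5)  # a random positive 5 x 5 matrix

# Plain fixed-point iteration of the Sinkhorn map
for (k in 1:500) {
  A_new <- sinkhorn_map(A)
  if (max(abs(A_new - A)) < 1e-10) break
  A <- A_new
}
rowSums(A); colSums(A)                # both approximately rep(1, 5)

In the experiments, slowly converging maps of this kind are handed to the acceleration schemes in place of the plain loop above.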
References
Altschuler J, Weed J, Rigollet P (2017). Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. arXiv preprint: https://arxiv.org/abs/1705.09634.
Geist M, Scherrer B (2018). Anderson acceleration for reinforcement learning. arXiv preprint: https://arxiv.org/abs/1809.09501.
Mena G, Belanger D, Linderman S, Snoek J (2018). Learning latent permutations with Gumbel-Sinkhorn networks. arXiv preprint: https://arxiv.org/abs/1802.08665.
Tseng P (2008). On accelerated proximal gradient methods for convex-concave optimization. Unpublished manuscript. http://www.math.washington.edu/~tseng/papers/apgm.pdf.
Varadhan R, Roland C (2004). Squared extrapolation methods (SQUAREM): A new class of simple and efficient numerical schemes for accelerating the convergence of the EM algorithm. Department of Biostatistics Working Paper 63, Johns Hopkins University. https://biostats.bepress.com/jhubiostat/paper63/.