<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1051</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1051</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science Reviews</subject></subj-group></article-categories>
<title-group>
<article-title>Accelerating Fixed-Point Algorithms in Statistics and Data Science: A State-of-Art Review</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Tang</surname><given-names>Bohao</given-names></name><xref ref-type="aff" rid="j_jds1051_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Henderson</surname><given-names>Nicholas C.</given-names></name><xref ref-type="aff" rid="j_jds1051_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Varadhan</surname><given-names>Ravi</given-names></name><email xlink:href="mailto:ravi.varadhan@jhu.edu">ravi.varadhan@jhu.edu</email><xref ref-type="aff" rid="j_jds1051_aff_001">1</xref><xref ref-type="aff" rid="j_jds1051_aff_003">3</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1051_aff_001"><label>1</label>Department of Biostatistics, <institution>Johns Hopkins University</institution>, Maryland, <country>USA</country></aff>
<aff id="j_jds1051_aff_002"><label>2</label>Department of Biostatistics, <institution>University of Michigan</institution>, Michigan, <country>USA</country></aff>
<aff id="j_jds1051_aff_003"><label>3</label>Quantitative Sciences Division, Sidney Kimmel Comprehensive Cancer Center, <institution>Johns Hopkins University</institution>, Maryland, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:ravi.varadhan@jhu.edu">ravi.varadhan@jhu.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2023</year></pub-date><pub-date pub-type="epub"><day>19</day><month>7</month><year>2022</year></pub-date><volume>21</volume><issue>1</issue><fpage>1</fpage><lpage>26</lpage><supplementary-material id="S1" content-type="document" xlink:href="jds1051_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>
<list>
<list-item id="j_jds1051_li_001">
<label>1.</label>
<p>We provide an R package, <monospace>AccelBenchmark</monospace>, available from <ext-link ext-link-type="uri" xlink:href="https://github.com/bhtang127/AccelBenchmark">GitHub</ext-link>, that can be used to reproduce the experiments in this paper. See the vignette and the <monospace>demo.R</monospace> file under the vignette folder for reference.</p>
</list-item>
<list-item id="j_jds1051_li_002">
<label>2.</label>
<p>We provide an additional <monospace>pdf</monospace> file that includes a) some basic properties of fixed-point iterations; b) visualizations of the convergence for all the experiments; and c) additional analysis of the Sinkhorn scaling problem.</p>
</list-item>
</list> 
</p>
</caption>
</supplementary-material><history><date date-type="received"><day>7</day><month>3</month><year>2022</year></date><date date-type="accepted"><day>25</day><month>5</month><year>2022</year></date></history>
<permissions><copyright-statement>2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Fixed-point algorithms are popular in statistics and data science due to their simplicity, guaranteed convergence, and applicability to high-dimensional problems. Well-known examples include the expectation-maximization (EM) algorithm, majorization-minimization (MM), and gradient-based algorithms such as gradient descent (GD) and proximal gradient descent. A characteristic weakness of these algorithms is their slow convergence. We discuss several state-of-the-art techniques for accelerating their convergence. We demonstrate these techniques and evaluate their efficiency and robustness in six distinct applications. Among the acceleration schemes, SQUAREM shows robust acceleration with a mean 18-fold speedup. DAAREM and restarted-Nesterov schemes also demonstrate consistently impressive accelerations. Thus, the original fixed-point algorithm can be accelerated by using one of the SQUAREM, DAAREM, or restarted-Nesterov acceleration schemes. We describe implementation details and software packages to facilitate the application of the acceleration schemes. We also discuss strategies for selecting a particular acceleration scheme for a given problem.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>convergence acceleration</kwd>
<kwd>EM</kwd>
<kwd>high dimensional models</kwd>
<kwd>MM</kwd>
<kwd>proximal gradient</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1051_reflist_001">
<title>References</title>
<ref id="j_jds1051_ref_001">
<mixed-citation publication-type="other"> <string-name><surname>Altschuler</surname> <given-names>J</given-names></string-name>, <string-name><surname>Weed</surname> <given-names>J</given-names></string-name>, <string-name><surname>Rigollet</surname> <given-names>P</given-names></string-name> (2017). Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. arXiv preprint: <uri>https://arxiv.org/abs/1705.09634</uri>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Anderson</surname> <given-names>DG</given-names></string-name> (<year>1965</year>). <article-title>Iterative procedures for nonlinear integral equations</article-title>. <source><italic>Journal of the ACM (JACM)</italic></source>, <volume>12</volume>(<issue>4</issue>): <fpage>547</fpage>–<lpage>560</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_003">
<mixed-citation publication-type="book"> <string-name><surname>Atkinson</surname> <given-names>KE</given-names></string-name> (<year>1976</year>). <source><italic>A Survey of Numerical Methods for the Solution of Fredholm Integral Equations of the Second Kind</italic></source>, volume <volume>16</volume>. <publisher-name>Society for Industrial and Applied Mathematics</publisher-name>, <publisher-loc>Philadelphia</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Berlinet</surname> <given-names>A</given-names></string-name>, <string-name><surname>Roland</surname> <given-names>C</given-names></string-name> (<year>2009</year>). <article-title>Parabolic acceleration of the EM algorithm</article-title>. <source><italic>Statistics and Computing</italic></source>, <volume>19</volume>(<issue>1</issue>): <fpage>35</fpage>–<lpage>47</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_005">
<mixed-citation publication-type="other"> <string-name><surname>Bobb</surname> <given-names>JF</given-names></string-name>, <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name> (2021). turboEM: A Suite of Convergence Acceleration Schemes for EM, MM and Other Fixed-Point Algorithms. R package version 2021.1.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Bottolo</surname> <given-names>L</given-names></string-name>, <string-name><surname>Richardson</surname> <given-names>S</given-names></string-name>, <etal>et al.</etal> (<year>2010</year>). <article-title>Evolutionary stochastic search for Bayesian model exploration</article-title>. <source><italic>Bayesian Analysis</italic></source>, <volume>5</volume>(<issue>3</issue>): <fpage>583</fpage>–<lpage>618</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_007">
<mixed-citation publication-type="book"> <string-name><surname>Boyd</surname> <given-names>S</given-names></string-name>, <string-name><surname>Vandenberghe</surname> <given-names>L</given-names></string-name> (<year>2004</year>). <source><italic>Convex Optimization</italic></source>. <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Carbonetto</surname> <given-names>P</given-names></string-name>, <string-name><surname>Stephens</surname> <given-names>M</given-names></string-name>, <etal>et al.</etal> (<year>2012</year>). <article-title>Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies</article-title>. <source><italic>Bayesian Analysis</italic></source>, <volume>7</volume>(<issue>1</issue>): <fpage>73</fpage>–<lpage>108</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_009">
<mixed-citation publication-type="chapter"> <string-name><surname>Chipman</surname> <given-names>H</given-names></string-name>, <string-name><surname>George</surname> <given-names>EI</given-names></string-name>, <string-name><surname>McCulloch</surname> <given-names>RE</given-names></string-name>, <string-name><surname>Clyde</surname> <given-names>M</given-names></string-name>, <string-name><surname>Foster</surname> <given-names>DP</given-names></string-name>, <string-name><surname>Stine</surname> <given-names>RA</given-names></string-name> (<year>2001</year>). <chapter-title>The practical implementation of Bayesian model selection</chapter-title>. <source><italic>Lecture Notes-Monograph Series</italic></source>, <fpage>65</fpage>–<lpage>134</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Clyde</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Ghosh</surname> <given-names>J</given-names></string-name>, <string-name><surname>Littman</surname> <given-names>ML</given-names></string-name> (<year>2011</year>). <article-title>Bayesian adaptive sampling for variable selection and model averaging</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>20</volume>(<issue>1</issue>): <fpage>80</fpage>–<lpage>101</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_011">
<mixed-citation publication-type="chapter"> <string-name><surname>Cook</surname> <given-names>J</given-names></string-name>, <string-name><surname>Sutskever</surname> <given-names>I</given-names></string-name>, <string-name><surname>Mnih</surname> <given-names>A</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>G</given-names></string-name> (<year>2007</year>). <chapter-title>Visualizing similarity data with a mixture of maps</chapter-title>. In: <source><italic>Artificial Intelligence and Statistics</italic></source>, <fpage>67</fpage>–<lpage>74</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Dempster</surname> <given-names>AP</given-names></string-name>, <string-name><surname>Laird</surname> <given-names>NM</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1977</year>). <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>. <source><italic>Journal of the Royal Statistical Society: Series B (Methodological)</italic></source>, <volume>39</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>22</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Du</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name> (<year>2020</year>). <article-title>SQUAREM: An R package for off-the-shelf acceleration of EM, MM and other EM-like monotone algorithms</article-title>. <source><italic>Journal of Statistical Software</italic></source>, <volume>92</volume>: <fpage>1</fpage>–<lpage>41</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Evans</surname> <given-names>C</given-names></string-name>, <string-name><surname>Pollock</surname> <given-names>S</given-names></string-name>, <string-name><surname>Rebholz</surname> <given-names>LG</given-names></string-name>, <string-name><surname>Xiao</surname> <given-names>M</given-names></string-name> (<year>2020</year>). <article-title>A proof that Anderson acceleration improves the convergence rate in linearly converging fixed-point methods (but not in those converging quadratically)</article-title>. <source><italic>SIAM Journal on Numerical Analysis</italic></source>, <volume>58</volume>(<issue>1</issue>): <fpage>788</fpage>–<lpage>810</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Fang</surname> <given-names>HR</given-names></string-name>, <string-name><surname>Saad</surname> <given-names>Y</given-names></string-name> (<year>2009</year>). <article-title>Two classes of multisecant methods for nonlinear acceleration</article-title>. <source><italic>Numerical Linear Algebra with Applications</italic></source>, <volume>16</volume>(<issue>3</issue>): <fpage>197</fpage>–<lpage>221</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Friedman</surname> <given-names>JH</given-names></string-name> (<year>2001</year>). <article-title>Greedy function approximation: A gradient boosting machine</article-title>. <source><italic>Annals of Statistics</italic></source>, <fpage>1189</fpage>–<lpage>1232</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_017">
<mixed-citation publication-type="other"> <string-name><surname>Geist</surname> <given-names>M</given-names></string-name>, <string-name><surname>Scherrer</surname> <given-names>B</given-names></string-name> (2018). Anderson acceleration for reinforcement learning. arXiv preprint: <uri>https://arxiv.org/abs/1809.09501</uri>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>George</surname> <given-names>EI</given-names></string-name>, <string-name><surname>McCulloch</surname> <given-names>RE</given-names></string-name> (<year>1997</year>). <article-title>Approaches for Bayesian variable selection</article-title>. <source><italic>Statistica Sinica</italic></source>, <fpage>339</fpage>–<lpage>373</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_019">
<mixed-citation publication-type="chapter"> <string-name><surname>Guyon</surname> <given-names>I</given-names></string-name>, <string-name><surname>Gunn</surname> <given-names>SR</given-names></string-name>, <string-name><surname>Ben-Hur</surname> <given-names>A</given-names></string-name>, <string-name><surname>Dror</surname> <given-names>G</given-names></string-name> (<year>2004</year>). <chapter-title>Result analysis of the NIPS 2003 feature selection challenge</chapter-title>. In: <source><italic>NIPS</italic></source>, volume <volume>4</volume>, <fpage>545</fpage>–<lpage>552</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Hasannasab</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hertrich</surname> <given-names>J</given-names></string-name>, <string-name><surname>Laus</surname> <given-names>F</given-names></string-name>, <string-name><surname>Steidl</surname> <given-names>G</given-names></string-name> (<year>2021</year>). <article-title>Alternatives to the EM algorithm for ML estimation of location, scatter matrix, and degree of freedom of the Student t distribution</article-title>. <source><italic>Numerical Algorithms</italic></source>, <volume>87</volume>(<issue>1</issue>): <fpage>77</fpage>–<lpage>118</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Hasselblad</surname> <given-names>V</given-names></string-name> (<year>1966</year>). <article-title>Estimation of parameters for a mixture of normal distributions</article-title>. <source><italic>Technometrics</italic></source>, <volume>8</volume>(<issue>3</issue>): <fpage>431</fpage>–<lpage>444</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_022">
<mixed-citation publication-type="other"> <string-name><surname>Henderson</surname> <given-names>N</given-names></string-name>, <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name> (2020). daarem: Damped Anderson Acceleration with Epsilon Monotonicity for Accelerating EM-Like Monotone Algorithms. R package version 0.5.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Henderson</surname> <given-names>NC</given-names></string-name>, <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name> (<year>2019</year>). <article-title>Damped Anderson acceleration with restarts and monotonicity control for accelerating EM and EM-like algorithms</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>28</volume>(<issue>4</issue>): <fpage>834</fpage>–<lpage>846</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Higham</surname> <given-names>NJ</given-names></string-name>, <string-name><surname>Strabić</surname> <given-names>N</given-names></string-name> (<year>2016</year>). <article-title>Anderson acceleration of the alternating projections method for computing the nearest correlation matrix</article-title>. <source><italic>Numerical Algorithms</italic></source>, <volume>72</volume>(<issue>4</issue>): <fpage>1021</fpage>–<lpage>1042</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Hobolth</surname> <given-names>A</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Kousholt</surname> <given-names>A</given-names></string-name>, <string-name><surname>Jensen</surname> <given-names>JL</given-names></string-name> (<year>2020</year>). <article-title>A unifying framework and comparison of algorithms for non-negative matrix factorisation</article-title>. <source><italic>International Statistical Review</italic></source>, <volume>88</volume>(<issue>1</issue>): <fpage>29</fpage>–<lpage>53</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Hunter</surname> <given-names>DR</given-names></string-name>, <string-name><surname>Lange</surname> <given-names>K</given-names></string-name> (<year>2004</year>). <article-title>A tutorial on MM algorithms</article-title>. <source><italic>The American Statistician</italic></source>, <volume>58</volume>(<issue>1</issue>): <fpage>30</fpage>–<lpage>37</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>Jin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Tam</surname> <given-names>OH</given-names></string-name>, <string-name><surname>Paniagua</surname> <given-names>E</given-names></string-name>, <string-name><surname>Hammell</surname> <given-names>M</given-names></string-name> (<year>2015</year>). <article-title>TEtranscripts: A package for including transposable elements in differential expression analysis of RNA-seq datasets</article-title>. <source><italic>Bioinformatics</italic></source>, <volume>31</volume>(<issue>22</issue>): <fpage>3593</fpage>–<lpage>3599</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_028">
<mixed-citation publication-type="journal"> <string-name><surname>Junkins</surname> <given-names>JL</given-names></string-name>, <string-name><surname>Bani Younes</surname> <given-names>A</given-names></string-name>, <string-name><surname>Woollands</surname> <given-names>RM</given-names></string-name>, <string-name><surname>Bai</surname> <given-names>X</given-names></string-name> (<year>2013</year>). <article-title>Picard iteration, Chebyshev polynomials and Chebyshev-Picard methods: Application in astrodynamics</article-title>. <source><italic>The Journal of the Astronautical Sciences</italic></source>, <volume>60</volume>(<issue>3</issue>): <fpage>623</fpage>–<lpage>653</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_029">
<mixed-citation publication-type="journal"> <string-name><surname>Knight</surname> <given-names>PA</given-names></string-name> (<year>2008</year>). <article-title>The Sinkhorn–Knopp algorithm: Convergence and applications</article-title>. <source><italic>SIAM Journal on Matrix Analysis and Applications</italic></source>, <volume>30</volume>(<issue>1</issue>): <fpage>261</fpage>–<lpage>275</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Liu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1995</year>). <article-title>ML estimation of the multivariate t distribution with unknown degrees of freedom</article-title>. <source><italic>Statistica Sinica</italic></source>, <volume>5</volume>: <fpage>19</fpage>–<lpage>39</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_031">
<mixed-citation publication-type="other"> <string-name><surname>Mena</surname> <given-names>G</given-names></string-name>, <string-name><surname>Belanger</surname> <given-names>D</given-names></string-name>, <string-name><surname>Linderman</surname> <given-names>S</given-names></string-name>, <string-name><surname>Snoek</surname> <given-names>J</given-names></string-name> (2018). Learning latent permutations with Gumbel-Sinkhorn networks. arXiv preprint: <uri>https://arxiv.org/abs/1802.08665</uri>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Nesterov</surname> <given-names>Y</given-names></string-name> (<year>2013</year>). <article-title>Gradient methods for minimizing composite functions</article-title>. <source><italic>Mathematical Programming</italic></source>, <volume>140</volume>(<issue>1</issue>): <fpage>125</fpage>–<lpage>161</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_033">
<mixed-citation publication-type="journal"> <string-name><surname>O’Donoghue</surname> <given-names>B</given-names></string-name>, <string-name><surname>Candes</surname> <given-names>E</given-names></string-name> (<year>2015</year>). <article-title>Adaptive restart for accelerated gradient schemes</article-title>. <source><italic>Foundations of Computational Mathematics</italic></source>, <volume>15</volume>(<issue>3</issue>): <fpage>715</fpage>–<lpage>732</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>Parlett</surname> <given-names>B</given-names></string-name>, <string-name><surname>Landis</surname> <given-names>TL</given-names></string-name> (<year>1982</year>). <article-title>Methods for scaling to doubly stochastic form</article-title>. <source><italic>Linear Algebra and its Applications</italic></source>, <volume>48</volume>: <fpage>53</fpage>–<lpage>79</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_035">
<mixed-citation publication-type="journal"> <string-name><surname>Raket</surname> <given-names>LL</given-names></string-name>, <string-name><surname>Grimme</surname> <given-names>B</given-names></string-name>, <string-name><surname>Schöner</surname> <given-names>G</given-names></string-name>, <string-name><surname>Igel</surname> <given-names>C</given-names></string-name>, <string-name><surname>Markussen</surname> <given-names>B</given-names></string-name> (<year>2016</year>). <article-title>Separating timing, movement conditions and individual differences in the analysis of human movement</article-title>. <source><italic>PLoS Computational Biology</italic></source>, <volume>12</volume>(<issue>9</issue>): <fpage>e1005092</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>Raydan</surname> <given-names>M</given-names></string-name>, <string-name><surname>Svaiter</surname> <given-names>BF</given-names></string-name> (<year>2002</year>). <article-title>Relaxed steepest descent and Cauchy-Barzilai-Borwein method</article-title>. <source><italic>Computational Optimization and Applications</italic></source>, <fpage>155</fpage>–<lpage>167</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_037">
<mixed-citation publication-type="journal"> <string-name><surname>Shiraishi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Tremmel</surname> <given-names>G</given-names></string-name>, <string-name><surname>Miyano</surname> <given-names>S</given-names></string-name>, <string-name><surname>Stephens</surname> <given-names>M</given-names></string-name> (<year>2015</year>). <article-title>A simple model-based approach to inferring and visualizing cancer mutation signatures</article-title>. <source><italic>PLoS Genetics</italic></source>, <volume>11</volume>(<issue>12</issue>): <fpage>e1005657</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_038">
<mixed-citation publication-type="journal"> <string-name><surname>Sinkhorn</surname> <given-names>R</given-names></string-name>, <string-name><surname>Knopp</surname> <given-names>P</given-names></string-name> (<year>1967</year>). <article-title>Concerning nonnegative matrices and doubly stochastic matrices</article-title>. <source><italic>Pacific Journal of Mathematics</italic></source>, <volume>21</volume>(<issue>2</issue>): <fpage>343</fpage>–<lpage>348</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_039">
<mixed-citation publication-type="journal"> <string-name><surname>Song</surname> <given-names>J</given-names></string-name>, <string-name><surname>Babu</surname> <given-names>P</given-names></string-name>, <string-name><surname>Palomar</surname> <given-names>DP</given-names></string-name> (<year>2016</year>). <article-title>Sequence set design with good correlation properties via majorization-minimization</article-title>. <source><italic>IEEE Transactions on Signal Processing</italic></source>, <volume>64</volume>(<issue>11</issue>): <fpage>2866</fpage>–<lpage>2879</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_040">
<mixed-citation publication-type="journal"> <string-name><surname>Tibshirani</surname> <given-names>R</given-names></string-name> (<year>1996</year>). <article-title>Regression shrinkage and selection via the lasso</article-title>. <source><italic>Journal of the Royal Statistical Society: Series B (Methodological)</italic></source>, <volume>58</volume>(<issue>1</issue>): <fpage>267</fpage>–<lpage>288</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_041">
<mixed-citation publication-type="journal"> <string-name><surname>Toth</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kelley</surname> <given-names>CT</given-names></string-name> (<year>2015</year>). <article-title>Convergence analysis for Anderson acceleration</article-title>. <source><italic>SIAM Journal on Numerical Analysis</italic></source>, <volume>53</volume>(<issue>2</issue>): <fpage>805</fpage>–<lpage>819</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_042">
<mixed-citation publication-type="other"> <string-name><surname>Tseng</surname> <given-names>P</given-names></string-name> (2009). On accelerated proximal gradient methods for convex-concave optimization. 2008, <uri>http://www.math.washington.edu/~tseng/papers/apgm.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_043">
<mixed-citation publication-type="journal"> <string-name><surname>Van der Maaten</surname> <given-names>L</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>G</given-names></string-name> (<year>2008</year>). <article-title>Visualizing data using t-SNE</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>9</volume>(<issue>11</issue>).</mixed-citation>
</ref>
<ref id="j_jds1051_ref_044">
<mixed-citation publication-type="other"> <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name>, <string-name><surname>Roland</surname> <given-names>C</given-names></string-name> (2004). Squared extrapolation methods (SQUAREM): A new class of simple and efficient numerical schemes for accelerating the convergence of the EM algorithm. <italic>Department of Biostatistics Working Paper 63</italic>, Johns Hopkins University. <uri>https://biostats.bepress.com/jhubiostat/paper63/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_045">
<mixed-citation publication-type="journal"> <string-name><surname>Varadhan</surname> <given-names>R</given-names></string-name>, <string-name><surname>Roland</surname> <given-names>C</given-names></string-name> (<year>2008</year>). <article-title>Simple and globally convergent methods for accelerating the convergence of any EM algorithm</article-title>. <source><italic>Scandinavian Journal of Statistics</italic></source>, <volume>35</volume>(<issue>2</issue>): <fpage>335</fpage>–<lpage>353</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_046">
<mixed-citation publication-type="journal"> <string-name><surname>Walker</surname> <given-names>HF</given-names></string-name>, <string-name><surname>Ni</surname> <given-names>P</given-names></string-name> (<year>2011</year>). <article-title>Anderson acceleration for fixed-point iterations</article-title>. <source><italic>SIAM Journal on Numerical Analysis</italic></source>, <volume>49</volume>(<issue>4</issue>): <fpage>1715</fpage>–<lpage>1735</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_047">
<mixed-citation publication-type="chapter"> <string-name><surname>Yang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Peltonen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Kaski</surname> <given-names>S</given-names></string-name> (<year>2015</year>). <chapter-title>Majorization-minimization for manifold embedding</chapter-title>. In: <source><italic>Artificial Intelligence and Statistics</italic></source>, <fpage>1088</fpage>–<lpage>1097</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_048">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>O’Donoghue</surname> <given-names>B</given-names></string-name>, <string-name><surname>Boyd</surname> <given-names>S</given-names></string-name> (<year>2020</year>). <article-title>Globally convergent type-I Anderson acceleration for nonsmooth fixed-point iterations</article-title>. <source><italic>SIAM Journal on Optimization</italic></source>, <volume>30</volume>(<issue>4</issue>): <fpage>3170</fpage>–<lpage>3197</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_049">
<mixed-citation publication-type="journal"> <string-name><surname>Zhou</surname> <given-names>H</given-names></string-name>, <string-name><surname>Alexander</surname> <given-names>D</given-names></string-name>, <string-name><surname>Lange</surname> <given-names>K</given-names></string-name> (<year>2011</year>). <article-title>A quasi-Newton acceleration for high-dimensional optimization algorithms</article-title>. <source><italic>Statistics and Computing</italic></source>, <volume>21</volume>(<issue>2</issue>): <fpage>261</fpage>–<lpage>273</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1051_ref_050">
<mixed-citation publication-type="journal"> <string-name><surname>Zhu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Stephens</surname> <given-names>M</given-names></string-name> (<year>2018</year>). <article-title>Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes</article-title>. <source><italic>Nature Communications</italic></source>, <volume>9</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>14</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
