<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1093</article-id>
<article-id pub-id-type="doi">10.6339/23-JDS1093</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Neural Generalized Ordinary Differential Equations with Layer-Varying Parameters</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Yu</surname><given-names>Duo</given-names></name><xref ref-type="aff" rid="j_jds1093_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Miao</surname><given-names>Hongyu</given-names></name><xref ref-type="aff" rid="j_jds1093_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wu</surname><given-names>Hulin</given-names></name><email xlink:href="mailto:hulin.wu@uth.tmc.edu">hulin.wu@uth.tmc.edu</email><xref ref-type="aff" rid="j_jds1093_aff_003">3</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1093_aff_001"><label>1</label>Department of Population Health, <institution>The University of Texas at Austin</institution>, <country>United States</country></aff>
<aff id="j_jds1093_aff_002"><label>2</label>College of Nursing, <institution>Florida State University</institution>, <country>United States</country></aff>
<aff id="j_jds1093_aff_003"><label>3</label>Department of Biostatistics and Data Science, <institution>The University of Texas Health Science Center at Houston</institution>, <country>United States</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:hulin.wu@uth.tmc.edu">hulin.wu@uth.tmc.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2024</year></pub-date><pub-date pub-type="epub"><day>23</day><month>2</month><year>2023</year></pub-date><volume>22</volume><issue>1</issue><fpage>10</fpage><lpage>24</lpage><supplementary-material id="S1" content-type="document" xlink:href="jds1093_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>Programming code to reproduce our results and figures can be found at <uri>https://github.com/Duo-Yu/Neural-GODE</uri>. In the Supplementary Material, we list the code directories and corresponding results.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>15</day><month>12</month><year>2022</year></date><date date-type="accepted"><day>19</day><month>2</month><year>2023</year></date></history>
<permissions><copyright-statement>2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Deep residual networks (ResNets) have shown state-of-the-art performance in various real-world applications. Recently, the ResNet model was reparameterized and interpreted as the solution to a continuous ordinary differential equation, giving the Neural-ODE model. In this study, we propose a neural generalized ordinary differential equation (Neural-GODE) model with layer-varying parameters that extends the Neural-ODE to approximate discrete ResNets. Specifically, we use nonparametric B-spline functions to parameterize the Neural-GODE so that the trade-off between model complexity and computational efficiency can be easily balanced. We demonstrate that ResNets and Neural-ODE models are special cases of the proposed Neural-GODE model. On two benchmark datasets, MNIST and CIFAR-10, we show that the layer-varying Neural-GODE is more flexible and general than the standard Neural-ODE. Furthermore, the Neural-GODE enjoys computational and memory benefits while performing comparably to ResNets in prediction accuracy.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>B-splines</kwd>
<kwd>deep residual networks</kwd>
<kwd>neural ordinary differential equations</kwd>
</kwd-group>
<funding-group><funding-statement>This work was supported in part by NIH grants R01 AI087135 and P03AI161943 (HW), a grant from the Cancer Prevention and Research Institute of Texas (PR170668) (HW), grant NSF/ECCS 2133106 (HM), and grant NSF/DMS 1620957 (HM).</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1093_reflist_001">
<title>References</title>
<ref id="j_jds1093_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Abdeltawab</surname> <given-names>H</given-names></string-name>, <string-name><surname>Shehata</surname> <given-names>M</given-names></string-name>, <string-name><surname>Shalaby</surname> <given-names>A</given-names></string-name>, <string-name><surname>Khalifa</surname> <given-names>F</given-names></string-name>, <string-name><surname>Mahmoud</surname> <given-names>A</given-names></string-name>, <string-name><surname>El-Ghar</surname> <given-names>MA</given-names></string-name>, <etal>et al.</etal> (<year>2019</year>). <article-title>A novel CNN-based CAD system for early assessment of transplanted kidney dysfunction</article-title>. <source><italic>Scientific Reports</italic></source>, <volume>9</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>11</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41598-018-37186-2" xlink:type="simple">https://doi.org/10.1038/s41598-018-37186-2</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_002">
<mixed-citation publication-type="book"> <string-name><surname>Arnold</surname> <given-names>VI</given-names></string-name> (<year>2012</year>). <source><italic>Geometrical Methods in the Theory of Ordinary Differential Equations</italic></source>. <publisher-name>Springer Science &amp; Business Media</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_003">
<mixed-citation publication-type="other"> <string-name><surname>Bahdanau</surname> <given-names>D</given-names></string-name>, <string-name><surname>Cho</surname> <given-names>K</given-names></string-name>, <string-name><surname>Bengio</surname> <given-names>Y</given-names></string-name> (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint <uri>https://arxiv.org/abs/1409.0473</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Bai</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kolter</surname> <given-names>JZ</given-names></string-name>, <string-name><surname>Koltun</surname> <given-names>V</given-names></string-name> (<year>2019</year>). <article-title>Deep equilibrium models</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>32</volume>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_005">
<mixed-citation publication-type="book"> <string-name><surname>Bartels</surname> <given-names>RH</given-names></string-name>, <string-name><surname>Beatty</surname> <given-names>JC</given-names></string-name>, <string-name><surname>Barsky</surname> <given-names>BA</given-names></string-name> (<year>1995</year>). <source><italic>An Introduction to Splines for Use in Computer Graphics and Geometric Modeling</italic></source>. <publisher-name>Morgan Kaufmann</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_006">
<mixed-citation publication-type="book"> <string-name><surname>Bishop</surname> <given-names>CM</given-names></string-name>, <etal>et al.</etal> (<year>1995</year>). <source><italic>Neural Networks for Pattern Recognition</italic></source>. <publisher-name>Oxford University Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_007">
<mixed-citation publication-type="other"> <string-name><surname>Chang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>M</given-names></string-name>, <string-name><surname>Haber</surname> <given-names>E</given-names></string-name>, <string-name><surname>Chi</surname> <given-names>EH</given-names></string-name> (2019). AntisymmetricRNN: A dynamical system view on recurrent neural networks. arXiv preprint <uri>https://arxiv.org/abs/1902.09689</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Chang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Meng</surname> <given-names>L</given-names></string-name>, <string-name><surname>Haber</surname> <given-names>E</given-names></string-name>, <string-name><surname>Ruthotto</surname> <given-names>L</given-names></string-name>, <string-name><surname>Begert</surname> <given-names>D</given-names></string-name>, <string-name><surname>Holtham</surname> <given-names>E</given-names></string-name> (<year>2018</year>). <article-title>Reversible architectures for arbitrarily deep residual neural networks</article-title>. <source><italic>Proceedings of the AAAI Conference on Artificial Intelligence</italic></source>, <volume>32</volume>(<issue>1</issue>).</mixed-citation>
</ref>
<ref id="j_jds1093_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Chen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>H</given-names></string-name> (<year>2008</year>). <article-title>Efficient local estimation for time-varying coefficients in deterministic dynamic models with applications to HIV-1 dynamics</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>103</volume>(<issue>481</issue>): <fpage>369</fpage>–<lpage>384</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1198/016214507000001382" xlink:type="simple">https://doi.org/10.1198/016214507000001382</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Chen</surname> <given-names>RT</given-names></string-name>, <string-name><surname>Rubanova</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Bettencourt</surname> <given-names>J</given-names></string-name>, <string-name><surname>Duvenaud</surname> <given-names>DK</given-names></string-name> (<year>2018</year>). <article-title>Neural ordinary differential equations</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>31</volume>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-3-030-04167-0" xlink:type="simple">https://doi.org/10.1007/978-3-030-04167-0</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_011">
<mixed-citation publication-type="other"> <string-name><surname>Chen</surname> <given-names>RTQ</given-names></string-name> (2018). torchdiffeq. <uri>https://github.com/rtqichen/torchdiffeq</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_012">
<mixed-citation publication-type="other"> <string-name><surname>Cranmer</surname> <given-names>M</given-names></string-name>, <string-name><surname>Greydanus</surname> <given-names>S</given-names></string-name>, <string-name><surname>Hoyer</surname> <given-names>S</given-names></string-name>, <string-name><surname>Battaglia</surname> <given-names>P</given-names></string-name>, <string-name><surname>Spergel</surname> <given-names>D</given-names></string-name>, <string-name><surname>Ho</surname> <given-names>S</given-names></string-name> (2020). Lagrangian neural networks. arXiv preprint <uri>https://arxiv.org/abs/2003.04630</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Dupont</surname> <given-names>E</given-names></string-name>, <string-name><surname>Doucet</surname> <given-names>A</given-names></string-name>, <string-name><surname>Teh</surname> <given-names>YW</given-names></string-name> (<year>2019</year>). <article-title>Augmented neural ODEs</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>32</volume>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Esteva</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kuprel</surname> <given-names>B</given-names></string-name>, <string-name><surname>Novoa</surname> <given-names>RA</given-names></string-name>, <string-name><surname>Ko</surname> <given-names>J</given-names></string-name>, <string-name><surname>Swetter</surname> <given-names>SM</given-names></string-name>, <string-name><surname>Blau</surname> <given-names>HM</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>Dermatologist-level classification of skin cancer with deep neural networks</article-title>. <source><italic>Nature</italic></source>, <volume>542</volume>(<issue>7639</issue>): <fpage>115</fpage>–<lpage>118</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/nature21056" xlink:type="simple">https://doi.org/10.1038/nature21056</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Goodfellow</surname> <given-names>I</given-names></string-name>, <string-name><surname>Pouget-Abadie</surname> <given-names>J</given-names></string-name>, <string-name><surname>Mirza</surname> <given-names>M</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>B</given-names></string-name>, <string-name><surname>Warde-Farley</surname> <given-names>D</given-names></string-name>, <string-name><surname>Ozair</surname> <given-names>S</given-names></string-name>, <etal>et al.</etal> (<year>2014</year>). <article-title>Generative adversarial nets</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>27</volume>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_016">
<mixed-citation publication-type="chapter"> <string-name><surname>Graves</surname> <given-names>A</given-names></string-name>, <string-name><surname>Mohamed</surname> <given-names>Ar</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>G</given-names></string-name> (<year>2013</year>). <chapter-title>Speech recognition with deep recurrent neural networks</chapter-title>. In: <source><italic>2013 IEEE International Conference on Acoustics, Speech and Signal Processing</italic></source>. <fpage>6645</fpage>–<lpage>6649</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Greydanus</surname> <given-names>S</given-names></string-name>, <string-name><surname>Dzamba</surname> <given-names>M</given-names></string-name>, <string-name><surname>Yosinski</surname> <given-names>J</given-names></string-name> (<year>2019</year>). <article-title>Hamiltonian neural networks</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>32</volume>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_018">
<mixed-citation publication-type="other"> <string-name><surname>Günther</surname> <given-names>S</given-names></string-name>, <string-name><surname>Pazner</surname> <given-names>W</given-names></string-name>, <string-name><surname>Qi</surname> <given-names>D</given-names></string-name> (2021). Spline parameterization of neural network controls for deep learning. arXiv preprint <uri>https://arxiv.org/abs/2103.00301</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Haber</surname> <given-names>E</given-names></string-name>, <string-name><surname>Ruthotto</surname> <given-names>L</given-names></string-name> (<year>2017</year>). <article-title>Stable architectures for deep neural networks</article-title>. <source><italic>Inverse Problems</italic></source>, <volume>34</volume>(<issue>1</issue>): <fpage>014004</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1088/1361-6420/aa9a90" xlink:type="simple">https://doi.org/10.1088/1361-6420/aa9a90</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_020">
<mixed-citation publication-type="chapter"> <string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name> (<year>2016</year>a). <chapter-title>Deep residual learning for image recognition</chapter-title>. In: <source><italic>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</italic></source>. <fpage>770</fpage>–<lpage>778</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_021">
<mixed-citation publication-type="chapter"> <string-name><surname>He</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>J</given-names></string-name> (<year>2016</year>b). <chapter-title>Identity mappings in deep residual networks</chapter-title>. In: <source><italic>European Conference on Computer Vision</italic></source>. <fpage>630</fpage>–<lpage>645</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Hornik</surname> <given-names>K</given-names></string-name>, <string-name><surname>Stinchcombe</surname> <given-names>M</given-names></string-name>, <string-name><surname>White</surname> <given-names>H</given-names></string-name> (<year>1989</year>). <article-title>Multilayer feedforward networks are universal approximators</article-title>. <source><italic>Neural Networks</italic></source>, <volume>2</volume>(<issue>5</issue>): <fpage>359</fpage>–<lpage>366</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/0893-6080(89)90020-8" xlink:type="simple">https://doi.org/10.1016/0893-6080(89)90020-8</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_023">
<mixed-citation publication-type="chapter"> <string-name><surname>Ioffe</surname> <given-names>S</given-names></string-name>, <string-name><surname>Szegedy</surname> <given-names>C</given-names></string-name> (<year>2015</year>). <chapter-title>Batch normalization: Accelerating deep network training by reducing internal covariate shift</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>. <fpage>448</fpage>–<lpage>456</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Kim</surname> <given-names>K</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>S</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>YH</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>SH</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>HS</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>S</given-names></string-name> (<year>2018</year>). <article-title>Performance of the deep convolutional neural network based magnetic resonance image scoring algorithm for differentiating between tuberculous and pyogenic spondylitis</article-title>. <source><italic>Scientific Reports</italic></source>, <volume>8</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>10</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41598-018-35713-9" xlink:type="simple">https://doi.org/10.1038/s41598-018-35713-9</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_025">
<mixed-citation publication-type="other"> <string-name><surname>Krizhevsky</surname> <given-names>A</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>G</given-names></string-name>, et al. (2009). Learning multiple layers of features from tiny images. Master’s Thesis, University of Toronto.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>LaSalle</surname> <given-names>JP</given-names></string-name> (<year>1968</year>). <article-title>Stability theory for ordinary differential equations</article-title>. <source><italic>Journal of Differential Equations</italic></source>, <volume>4</volume>(<issue>1</issue>): <fpage>57</fpage>–<lpage>65</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/0022-0396(68)90048-X" xlink:type="simple">https://doi.org/10.1016/0022-0396(68)90048-X</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>LeCun</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Bottou</surname> <given-names>L</given-names></string-name>, <string-name><surname>Bengio</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Haffner</surname> <given-names>P</given-names></string-name> (<year>1998</year>). <article-title>Gradient-based learning applied to document recognition</article-title>. <source><italic>Proceedings of the IEEE</italic></source>, <volume>86</volume>(<issue>11</issue>): <fpage>2278</fpage>–<lpage>2324</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/5.726791" xlink:type="simple">https://doi.org/10.1109/5.726791</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_028">
<mixed-citation publication-type="other"> <string-name><surname>Li</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Tai</surname> <given-names>C</given-names></string-name>, et al. (2017). Maximum principle based algorithms for deep learning. arXiv preprint <uri>https://arxiv.org/abs/1710.09513</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_029">
<mixed-citation publication-type="other"> <string-name><surname>Li</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>T</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>Z</given-names></string-name> (2019). Deep learning via dynamical systems: An approximation perspective. arXiv preprint <uri>https://arxiv.org/abs/1912.10382</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Liang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Miao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>H</given-names></string-name> (<year>2010</year>). <article-title>Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model</article-title>. <source><italic>The Annals of Applied Statistics</italic></source>, <volume>4</volume>(<issue>1</issue>): <fpage>460</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/09-AOAS290" xlink:type="simple">https://doi.org/10.1214/09-AOAS290</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Lim</surname> <given-names>SH</given-names></string-name> (<year>2021</year>). <article-title>Understanding recurrent neural networks using nonequilibrium response theory</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>22</volume>: <fpage>1</fpage>–<lpage>47</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Lim</surname> <given-names>SH</given-names></string-name>, <string-name><surname>Erichson</surname> <given-names>NB</given-names></string-name>, <string-name><surname>Hodgkinson</surname> <given-names>L</given-names></string-name>, <string-name><surname>Mahoney</surname> <given-names>MW</given-names></string-name> (<year>2021</year>). <article-title>Noisy recurrent neural networks</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>34</volume>: <fpage>5124</fpage>–<lpage>5137</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_033">
<mixed-citation publication-type="chapter"> <string-name><surname>Long</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shelhamer</surname> <given-names>E</given-names></string-name>, <string-name><surname>Darrell</surname> <given-names>T</given-names></string-name> (<year>2015</year>). <chapter-title>Fully convolutional networks for semantic segmentation</chapter-title>. In: <source><italic>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</italic></source>. <fpage>3431</fpage>–<lpage>3440</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_034">
<mixed-citation publication-type="chapter"> <string-name><surname>Lu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhong</surname> <given-names>A</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>B</given-names></string-name> (<year>2018</year>). <chapter-title>Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>. <fpage>3276</fpage>–<lpage>3285</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_035">
<mixed-citation publication-type="journal"> <string-name><surname>Massaroli</surname> <given-names>S</given-names></string-name>, <string-name><surname>Poli</surname> <given-names>M</given-names></string-name>, <string-name><surname>Park</surname> <given-names>J</given-names></string-name>, <string-name><surname>Yamashita</surname> <given-names>A</given-names></string-name>, <string-name><surname>Asama</surname> <given-names>H</given-names></string-name> (<year>2020</year>). <article-title>Dissecting neural ODEs</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>33</volume>: <fpage>3952</fpage>–<lpage>3963</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>Miao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Xue</surname> <given-names>H</given-names></string-name> (<year>2014</year>). <article-title>Generalized ordinary differential equation models</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>109</volume>(<issue>508</issue>): <fpage>1672</fpage>–<lpage>1682</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2014.957287" xlink:type="simple">https://doi.org/10.1080/01621459.2014.957287</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_037">
<mixed-citation publication-type="journal"> <string-name><surname>Noda</surname> <given-names>K</given-names></string-name>, <string-name><surname>Yamaguchi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Nakadai</surname> <given-names>K</given-names></string-name>, <string-name><surname>Okuno</surname> <given-names>HG</given-names></string-name>, <string-name><surname>Ogata</surname> <given-names>T</given-names></string-name> (<year>2015</year>). <article-title>Audio-visual speech recognition using deep learning</article-title>. <source><italic>Applied Intelligence</italic></source>, <volume>42</volume>(<issue>4</issue>): <fpage>722</fpage>–<lpage>737</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s10489-014-0629-7" xlink:type="simple">https://doi.org/10.1007/s10489-014-0629-7</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_038">
<mixed-citation publication-type="journal"> <string-name><surname>Perperoglou</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sauerbrei</surname> <given-names>W</given-names></string-name>, <string-name><surname>Abrahamowicz</surname> <given-names>M</given-names></string-name>, <string-name><surname>Schmid</surname> <given-names>M</given-names></string-name> (<year>2019</year>). <article-title>A review of spline function procedures in R</article-title>. <source><italic>BMC Medical Research Methodology</italic></source>, <volume>19</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>16</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1186/s12874-018-0650-3" xlink:type="simple">https://doi.org/10.1186/s12874-018-0650-3</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_039">
<mixed-citation publication-type="chapter"> <string-name><surname>Qiu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>T</given-names></string-name>, <string-name><surname>Mei</surname> <given-names>T</given-names></string-name> (<year>2017</year>). <chapter-title>Learning spatio-temporal representation with pseudo-3D residual networks</chapter-title>. In: <source><italic>Proceedings of the IEEE International Conference on Computer Vision</italic></source>. <fpage>5533</fpage>–<lpage>5541</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_040">
<mixed-citation publication-type="journal"> <string-name><surname>Queiruga</surname> <given-names>A</given-names></string-name>, <string-name><surname>Erichson</surname> <given-names>NB</given-names></string-name>, <string-name><surname>Hodgkinson</surname> <given-names>L</given-names></string-name>, <string-name><surname>Mahoney</surname> <given-names>MW</given-names></string-name> (<year>2021</year>). <article-title>Stateful ODE-nets using basis function expansions</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>34</volume>: <fpage>21770</fpage>–<lpage>21781</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_041">
<mixed-citation publication-type="other"> <string-name><surname>Queiruga</surname> <given-names>AF</given-names></string-name>, <string-name><surname>Erichson</surname> <given-names>NB</given-names></string-name>, <string-name><surname>Taylor</surname> <given-names>D</given-names></string-name>, <string-name><surname>Mahoney</surname> <given-names>MW</given-names></string-name> (2020). Continuous-in-depth neural networks. arXiv preprint <uri>https://arxiv.org/abs/2008.02389</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_042">
<mixed-citation publication-type="book"> <string-name><surname>Ripley</surname> <given-names>BD</given-names></string-name> (<year>2007</year>). <source><italic>Pattern Recognition and Neural Networks</italic></source>. <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_043">
<mixed-citation publication-type="other"> <string-name><surname>Rusch</surname> <given-names>TK</given-names></string-name>, <string-name><surname>Mishra</surname> <given-names>S</given-names></string-name> (2020). Coupled oscillatory recurrent neural network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies. arXiv preprint <uri>https://arxiv.org/abs/2010.00951</uri>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_044">
<mixed-citation publication-type="chapter"> <string-name><surname>Rusch</surname> <given-names>TK</given-names></string-name>, <string-name><surname>Mishra</surname> <given-names>S</given-names></string-name> (<year>2021</year>). <chapter-title>UnICORNN: A recurrent model for learning very long time dependencies</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>, <fpage>9168</fpage>–<lpage>9178</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_045">
<mixed-citation publication-type="journal"> <string-name><surname>Ruthotto</surname> <given-names>L</given-names></string-name>, <string-name><surname>Haber</surname> <given-names>E</given-names></string-name> (<year>2020</year>). <article-title>Deep neural networks motivated by partial differential equations</article-title>. <source><italic>Journal of Mathematical Imaging and Vision</italic></source>, <volume>62</volume>(<issue>3</issue>): <fpage>352</fpage>–<lpage>364</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s10851-019-00903-1" xlink:type="simple">https://doi.org/10.1007/s10851-019-00903-1</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_046">
<mixed-citation publication-type="journal"> <string-name><surname>Sigaki</surname> <given-names>HY</given-names></string-name>, <string-name><surname>Lenzi</surname> <given-names>EK</given-names></string-name>, <string-name><surname>Zola</surname> <given-names>RS</given-names></string-name>, <string-name><surname>Perc</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ribeiro</surname> <given-names>HV</given-names></string-name> (<year>2020</year>). <article-title>Learning physical properties of liquid crystals with deep convolutional neural networks</article-title>. <source><italic>Scientific Reports</italic></source>, <volume>10</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>10</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41598-019-56847-4" xlink:type="simple">https://doi.org/10.1038/s41598-019-56847-4</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_047">
<mixed-citation publication-type="journal"> <string-name><surname>Silver</surname> <given-names>D</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>A</given-names></string-name>, <string-name><surname>Maddison</surname> <given-names>CJ</given-names></string-name>, <string-name><surname>Guez</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sifre</surname> <given-names>L</given-names></string-name>, <string-name><surname>Van Den Driessche</surname> <given-names>G</given-names></string-name>, <etal>et al.</etal> (<year>2016</year>). <article-title>Mastering the game of Go with deep neural networks and tree search</article-title>. <source><italic>Nature</italic></source>, <volume>529</volume>(<issue>7587</issue>): <fpage>484</fpage>–<lpage>489</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/nature16961" xlink:type="simple">https://doi.org/10.1038/nature16961</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_048">
<mixed-citation publication-type="journal"> <string-name><surname>Silver</surname> <given-names>D</given-names></string-name>, <string-name><surname>Schrittwieser</surname> <given-names>J</given-names></string-name>, <string-name><surname>Simonyan</surname> <given-names>K</given-names></string-name>, <string-name><surname>Antonoglou</surname> <given-names>I</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>A</given-names></string-name>, <string-name><surname>Guez</surname> <given-names>A</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <article-title>Mastering the game of Go without human knowledge</article-title>. <source><italic>Nature</italic></source>, <volume>550</volume>(<issue>7676</issue>): <fpage>354</fpage>–<lpage>359</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/nature24270" xlink:type="simple">https://doi.org/10.1038/nature24270</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_049">
<mixed-citation publication-type="book"> <string-name><surname>Simmons</surname> <given-names>GF</given-names></string-name> (<year>2016</year>). <source><italic>Differential Equations with Applications and Historical Notes</italic></source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_050">
<mixed-citation publication-type="journal"> <string-name><surname>Tang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>A</given-names></string-name>, <string-name><surname>Xiao</surname> <given-names>Y</given-names></string-name> (<year>2015</year>). <article-title>Holling II predator–prey impulsive semi-dynamic model with complex Poincaré map</article-title>. <source><italic>Nonlinear Dynamics</italic></source>, <volume>81</volume>(<issue>3</issue>): <fpage>1575</fpage>–<lpage>1596</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s11071-015-2092-3" xlink:type="simple">https://doi.org/10.1007/s11071-015-2092-3</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_051">
<mixed-citation publication-type="journal"> <string-name><surname>Weinan</surname> <given-names>E</given-names></string-name> (<year>2017</year>). <article-title>A proposal on machine learning via dynamical systems</article-title>. <source><italic>Communications in Mathematics and Statistics</italic></source>, <volume>5</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>11</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_052">
<mixed-citation publication-type="chapter"> <string-name><surname>Wu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>He</surname> <given-names>K</given-names></string-name> (<year>2018</year>). <chapter-title>Group normalization</chapter-title>. In: <source><italic>Proceedings of the European Conference on Computer Vision (ECCV)</italic></source>. <fpage>3</fpage>–<lpage>19</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_053">
<mixed-citation publication-type="journal"> <string-name><surname>Xue</surname> <given-names>H</given-names></string-name>, <string-name><surname>Miao</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>H</given-names></string-name> (<year>2010</year>). <article-title>Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error</article-title>. <source><italic>Annals of Statistics</italic></source>, <volume>38</volume>(<issue>4</issue>): <fpage>2351</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/09-AOS784" xlink:type="simple">https://doi.org/10.1214/09-AOS784</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_054">
<mixed-citation publication-type="journal"> <string-name><surname>Young</surname> <given-names>T</given-names></string-name>, <string-name><surname>Hazarika</surname> <given-names>D</given-names></string-name>, <string-name><surname>Poria</surname> <given-names>S</given-names></string-name>, <string-name><surname>Cambria</surname> <given-names>E</given-names></string-name> (<year>2018</year>). <article-title>Recent trends in deep learning based natural language processing</article-title>. <source><italic>IEEE Computational Intelligence Magazine</italic></source>, <volume>13</volume>(<issue>3</issue>): <fpage>55</fpage>–<lpage>75</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/MCI.2018.2840738" xlink:type="simple">https://doi.org/10.1109/MCI.2018.2840738</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_055">
<mixed-citation publication-type="book"> <string-name><surname>Yu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Deng</surname> <given-names>L</given-names></string-name> (<year>2016</year>). <source><italic>Automatic Speech Recognition</italic></source>, volume <volume>1</volume>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_056">
<mixed-citation publication-type="journal"> <string-name><surname>Yu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Chiu</surname> <given-names>AP</given-names></string-name>, <string-name><surname>He</surname> <given-names>D</given-names></string-name> (<year>2017</year>). <article-title>Effects of reactive social distancing on the 1918 influenza pandemic</article-title>. <source><italic>PLoS ONE</italic></source>, <volume>12</volume>(<issue>7</issue>): <fpage>e0180545</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1371/journal.pone.0180545" xlink:type="simple">https://doi.org/10.1371/journal.pone.0180545</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_057">
<mixed-citation publication-type="journal"> <string-name><surname>Yu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Lou</surname> <given-names>Y</given-names></string-name> (<year>2016</year>). <article-title>Revisiting logistic population model for assessing periodically harvested closures</article-title>. <source><italic>Communications in Mathematical Biology and Neuroscience</italic></source>, <year>2016</year>: Article ID <elocation-id>14</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_058">
<mixed-citation publication-type="chapter"> <string-name><surname>Yu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Yaseen</surname> <given-names>A</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>X</given-names></string-name> (<year>2020</year>). <chapter-title>Neural network and deep learning methods for EHR data</chapter-title>. In: <source><italic>Statistics and Machine Learning Methods for EHR Data</italic></source> (<string-name><given-names>H</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>JM</given-names> <surname>Yamal</surname></string-name>, <string-name><given-names>A</given-names> <surname>Yaseen</surname></string-name>, <string-name><given-names>V</given-names> <surname>Maroufy</surname></string-name>, eds.), <fpage>253</fpage>–<lpage>271</lpage>. <publisher-name>Chapman and Hall/CRC</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_059">
<mixed-citation publication-type="journal"> <string-name><surname>Yu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Soltanalizadeh</surname> <given-names>B</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>X</given-names></string-name>, <etal>et al.</etal> (<year>2021</year>). <article-title>Assessing effects of reopening policies on COVID-19 pandemic in Texas with a data-driven transmission model</article-title>. <source><italic>Infectious Disease Modelling</italic></source>, <volume>6</volume>: <fpage>461</fpage>–<lpage>473</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.idm.2021.02.001" xlink:type="simple">https://doi.org/10.1016/j.idm.2021.02.001</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_060">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>M</given-names></string-name>, <string-name><surname>Han</surname> <given-names>TX</given-names></string-name>, <string-name><surname>Yuan</surname> <given-names>X</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>L</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>T</given-names></string-name> (<year>2017</year>). <article-title>Residual networks of residual networks: Multilevel residual networks</article-title>. <source><italic>IEEE Transactions on Circuits and Systems for Video Technology</italic></source>, <volume>28</volume>(<issue>6</issue>): <fpage>1303</fpage>–<lpage>1314</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/TCSVT.2017.2654543" xlink:type="simple">https://doi.org/10.1109/TCSVT.2017.2654543</ext-link></mixed-citation>
</ref>
<ref id="j_jds1093_ref_061">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>T</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Gholami</surname> <given-names>A</given-names></string-name>, <string-name><surname>Gonzalez</surname> <given-names>JE</given-names></string-name>, <string-name><surname>Keutzer</surname> <given-names>K</given-names></string-name>, <string-name><surname>Mahoney</surname> <given-names>MW</given-names></string-name>, <etal>et al.</etal> (<year>2019</year>). <article-title>ANODEV2: A coupled neural ODE framework</article-title>. <source><italic>Advances in Neural Information Processing Systems</italic></source>, <volume>32</volume>.</mixed-citation>
</ref>
<ref id="j_jds1093_ref_062">
<mixed-citation publication-type="other"><uri>https://arxiv.org/abs/1909.12077</uri></mixed-citation>
</ref>
</ref-list>
</back>
</article>
