<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1025</article-id>
<article-id pub-id-type="doi">10.6339/21-JDS1025</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Predictive Comparison Between Random Machines and Random Forests</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Maia</surname><given-names>Mateus</given-names></name><xref ref-type="aff" rid="j_jds1025_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Azevedo</surname><given-names>Arthur R.</given-names></name><xref ref-type="aff" rid="j_jds1025_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Ara</surname><given-names>Anderson</given-names></name><email xlink:href="mailto:alsouzara@gmail.com">alsouzara@gmail.com</email><xref ref-type="aff" rid="j_jds1025_aff_003">3</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1025_aff_001"><label>1</label>Department of Math &amp; Statistics, <institution>Maynooth University</institution>, Maynooth, <country>Ireland</country></aff>
<aff id="j_jds1025_aff_002"><label>2</label>Department of Statistics, <institution>Federal University of Bahia</institution>, Salvador-BA, <country>Brazil</country></aff>
<aff id="j_jds1025_aff_003"><label>3</label>Department of Statistics, <institution>Federal University of Paraná</institution>, Curitiba-PR, <country>Brazil</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:alsouzara@gmail.com">alsouzara@gmail.com</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2021</year></pub-date><pub-date pub-type="epub"><day>28</day><month>10</month><year>2021</year></pub-date><volume>19</volume><issue>4</issue><fpage>593</fpage><lpage>614</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1025_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material A</title>
<p>The random machines (RM) method is also implemented in R and can be used through the rmachines package, available and documented on GitHub at <uri>https://github.com/MateusMaiaDS/rmachines</uri>. For an overall description of how to reproduce the results in this article, see the README at <uri>https://mateusmaiads.github.io/rmachines_and_randomforest/</uri>.</p>
</caption>
</supplementary-material><supplementary-material id="S2" content-type="document" xlink:href="jds1025_s002.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material B</title>
<p>Presents a descriptive analysis of the three real-world applications discussed in Section 5, together with additional results on the comparison between random machines (RM) and random forests (RF).</p>
</caption>
</supplementary-material><history><date date-type="received"><day>5</day><month>8</month><year>2021</year></date><date date-type="accepted"><day>19</day><month>9</month><year>2021</year></date></history>
<permissions><copyright-statement>2021 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Ensemble techniques have been gaining strength among machine learning models for supervised tasks because of their strong predictive capacity compared with some traditional approaches. The random forest is considered one of the off-the-shelf algorithms because of its flexibility and robust performance in both regression and classification tasks. In this paper, the random machines method is applied to simulated and benchmark data sets and compared with the well-established random forest models. The results from the simulated models show that the random machines method has better predictive performance than the random forest in most of the investigated data sets. Three real-data applications demonstrate that random machines may be used to solve real-world problems with competitive performance.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>bagging</kwd>
<kwd>ensemble</kwd>
<kwd>support vector machines</kwd>
</kwd-group>
<funding-group><award-group><funding-source xlink:href="https://doi.org/10.13039/501100002322">CAPES</funding-source></award-group><award-group><funding-source xlink:href="https://doi.org/10.13039/501100001602">Science Foundation Ireland</funding-source><award-id>17/CDA/4695</award-id></award-group><funding-statement>The authors gratefully acknowledge the financial support of the Brazilian research funding agency CAPES (Federal Agency for the Support and Improvement of Higher Education). M.M.’s work was supported by a Science Foundation Ireland Career Development Award, grant 17/CDA/4695.</funding-statement></funding-group>
</article-meta>
</front>
<body/>
<back>
<ref-list id="j_jds1025_reflist_001">
<title>References</title>
<ref id="j_jds1025_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Al-Rajab</surname> <given-names>M</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>Q</given-names></string-name> (<year>2017</year>). <article-title>Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis</article-title>. <source>Computer Methods and Programs in Biomedicine</source>, <volume>146</volume>: <fpage>11</fpage>–<lpage>24</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_002">
<mixed-citation publication-type="other"> <string-name><surname>Td</surname> <given-names>A</given-names></string-name> (<year>2017</year>). O conceito de amor: um estudo exploratório com uma amostra brasileira, Ph.D. thesis, Universidade de São Paulo.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Ara</surname> <given-names>A</given-names></string-name>, <string-name><surname>Maia</surname> <given-names>M</given-names></string-name>, <string-name><surname>Louzada</surname> <given-names>F</given-names></string-name>, <string-name><surname>Macêdo</surname> <given-names>S</given-names></string-name> (<year>2021</year>). <article-title>Random machines: A bagged-weighted support vector model with free kernel choice</article-title>. <source>Journal of Data Science</source>, <volume>19</volume>(<issue>3</issue>): <fpage>409</fpage>–<lpage>428</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_004">
<mixed-citation publication-type="chapter"> <string-name><surname>Batuwita</surname> <given-names>R</given-names></string-name>, <string-name><surname>Palade</surname> <given-names>V</given-names></string-name> (<year>2013</year>). <chapter-title>Class imbalance learning methods for support vector machines</chapter-title>. In: <source>Imbalanced learning: Foundations, Algorithms, and Applications</source>, <fpage>83</fpage>–<lpage>99</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Bhavan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chauhan</surname> <given-names>P</given-names></string-name>, <string-name><surname>Shah</surname> <given-names>RR</given-names></string-name>, <etal>et al.</etal> (<year>2019</year>). <article-title>Bagged support vector machines for emotion recognition from speech</article-title>. <source>Knowledge-Based Systems</source>, <volume>184</volume>: <fpage>104886</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_006">
<mixed-citation publication-type="chapter"> <string-name><surname>Bosch</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zisserman</surname> <given-names>A</given-names></string-name>, <string-name><surname>Munoz</surname> <given-names>X</given-names></string-name> (<year>2007</year>). <chapter-title>Image classification using random forests and ferns</chapter-title>. In: <source>2007 IEEE 11th International Conference on Computer Vision</source>, <fpage>1</fpage>–<lpage>8</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Boughorbel</surname> <given-names>S</given-names></string-name>, <string-name><surname>Jarray</surname> <given-names>F</given-names></string-name>, <string-name><surname>El-Anbari</surname> <given-names>M</given-names></string-name> (<year>2017</year>). <article-title>Optimal classifier for imbalanced data using Matthews correlation coefficient metric</article-title>. <source>PLoS ONE</source>, <volume>12</volume>(<issue>6</issue>): <fpage>e0177678</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>1996</year>). <article-title>Bagging predictors</article-title>. <source>Machine Learning</source>, <volume>24</volume>(<issue>2</issue>): <fpage>123</fpage>–<lpage>140</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>2001</year>). <article-title>Random forests</article-title>. <source>Machine Learning</source>, <volume>45</volume>(<issue>1</issue>): <fpage>5</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_010">
<mixed-citation publication-type="other"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>2002</year>). Manual on setting up, using, and understanding random forests v3.1. Statistics Department, University of California, Berkeley, CA, USA, 1:58.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name>, <etal>et al.</etal> (<year>1996</year>). <article-title>Heuristics of instability and stabilization in model selection</article-title>. <source>The Annals of Statistics</source>, <volume>24</volume>(<issue>6</issue>): <fpage>2350</fpage>–<lpage>2383</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_012">
<mixed-citation publication-type="other"> <string-name><surname>Callado</surname> <given-names>ALC</given-names></string-name> (<year>2003</year>). Estudo sobre insolvência entre empresas paraibanas: uma aplicação do termômetro de Kanitz. <italic>Anais do Encontro Nordestino de Contabilidade–ENECON</italic>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Cortes</surname> <given-names>C</given-names></string-name>, <string-name><surname>Vapnik</surname> <given-names>V</given-names></string-name> (<year>1995</year>). <article-title>Support-vector networks</article-title>. <source>Machine Learning</source>, <volume>20</volume>(<issue>3</issue>): <fpage>273</fpage>–<lpage>297</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_014">
<mixed-citation publication-type="chapter"> <string-name><surname>Dietterich</surname> <given-names>TG</given-names></string-name> (<year>2000</year>). <chapter-title>Ensemble methods in machine learning</chapter-title>. In: <source>International Workshop on Multiple Classifier Systems</source>, <fpage>1</fpage>–<lpage>15</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_015">
<mixed-citation publication-type="chapter"> <string-name><surname>Drucker</surname> <given-names>H</given-names></string-name>, <string-name><surname>Burges</surname> <given-names>CJ</given-names></string-name>, <string-name><surname>Kaufman</surname> <given-names>L</given-names></string-name>, <string-name><surname>Smola</surname> <given-names>AJ</given-names></string-name>, <string-name><surname>Vapnik</surname> <given-names>V</given-names></string-name> (<year>1997</year>). <chapter-title>Support vector regression machines</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source>, <fpage>155</fpage>–<lpage>161</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_016">
<mixed-citation publication-type="other"> <string-name><surname>Dua</surname> <given-names>D</given-names></string-name>, <string-name><surname>Graff</surname> <given-names>C</given-names></string-name> (<year>2017</year>). UCI machine learning repository.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Fernández-Delgado</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cernadas</surname> <given-names>E</given-names></string-name>, <string-name><surname>Barro</surname> <given-names>S</given-names></string-name>, <string-name><surname>Amorim</surname> <given-names>D</given-names></string-name> (<year>2014</year>). <article-title>Do we need hundreds of classifiers to solve real world classification problems?</article-title> <source>The Journal of Machine Learning Research</source>, <volume>15</volume>(<issue>1</issue>): <fpage>3133</fpage>–<lpage>3181</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_018">
<mixed-citation publication-type="book"> <string-name><surname>Fletcher</surname> <given-names>R</given-names></string-name> (<year>2013</year>). <source>Practical Methods of Optimization</source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Freund</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Schapire</surname> <given-names>RE</given-names></string-name> (<year>1997</year>). <article-title>A decision-theoretic generalization of on-line learning and an application to boosting</article-title>. <source>Journal of Computer and System Sciences</source>, <volume>55</volume>(<issue>1</issue>): <fpage>119</fpage>–<lpage>139</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Futoma</surname> <given-names>J</given-names></string-name>, <string-name><surname>Morris</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lucas</surname> <given-names>J</given-names></string-name> (<year>2015</year>). <article-title>A comparison of models for predicting early hospital readmissions</article-title>. <source>Journal of Biomedical Informatics</source>, <volume>56</volume>: <fpage>229</fpage>–<lpage>238</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Gul</surname> <given-names>A</given-names></string-name>, <string-name><surname>Perperoglou</surname> <given-names>A</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Mahmoud</surname> <given-names>O</given-names></string-name>, <string-name><surname>Miftahuddin</surname> <given-names>M</given-names></string-name>, <string-name><surname>Adler</surname> <given-names>W</given-names></string-name>, <etal>et al.</etal> (<year>2018</year>). <article-title>Ensemble of a subset of kNN classifiers</article-title>. <source>Advances in Data Analysis and Classification</source>, <volume>12</volume>(<issue>4</issue>): <fpage>827</fpage>–<lpage>840</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Ho</surname> <given-names>TK</given-names></string-name> (<year>1998</year>). <article-title>The random subspace method for constructing decision forests</article-title>. <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, <volume>20</volume>(<issue>8</issue>): <fpage>1</fpage>–<lpage>22</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_023">
<mixed-citation publication-type="chapter"> <string-name><surname>Huang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ling</surname> <given-names>CX</given-names></string-name> (<year>2003</year>). <chapter-title>Comparing naive Bayes, decision trees, and SVM with AUC and accuracy</chapter-title>. In: <source>Third IEEE International Conference on Data Mining</source>, <fpage>553</fpage>–<lpage>556</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_024">
<mixed-citation publication-type="chapter"> <string-name><surname>Huo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>T</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>J</given-names></string-name> (<year>2016</year>). <chapter-title>Comparison of random forest and SVM for electrical short-term load forecast with different data sources</chapter-title>. In: <source>2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS)</source>, <fpage>1077</fpage>–<lpage>1080</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_025">
<mixed-citation publication-type="other"> <string-name><surname>Kim</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>C</given-names></string-name> (<year>2020</year>). Influence diagnostics in support vector machines. <italic>Journal of the Korean Statistical Society</italic>, 1–22.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_026">
<mixed-citation publication-type="chapter"> <string-name><surname>Land</surname> <given-names>WH</given-names></string-name>, <string-name><surname>Schaffer</surname> <given-names>JD</given-names></string-name> (<year>2020</year>). <chapter-title>The support vector machine</chapter-title>. In: <source>The Art and Science of Machine Intelligence</source>, <fpage>45</fpage>–<lpage>76</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_027">
<mixed-citation publication-type="chapter"> <string-name><surname>Larsen</surname> <given-names>J</given-names></string-name>, <string-name><surname>Goutte</surname> <given-names>C</given-names></string-name> (<year>1999</year>). <chapter-title>On optimal data split for generalization estimation and model selection</chapter-title>. In: <source>Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No. 98TH8468)</source>, <fpage>225</fpage>–<lpage>234</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_028">
<mixed-citation publication-type="chapter"> <string-name><surname>Liang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>C</given-names></string-name> (<year>2011</year>). <chapter-title>An empirical study of bagging predictors for different learning algorithms</chapter-title>. In: <source>Twenty-Fifth AAAI Conference on Artificial Intelligence</source>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_029">
<mixed-citation publication-type="journal"> <string-name><surname>Matthews</surname> <given-names>BW</given-names></string-name> (<year>1975</year>). <article-title>Comparison of the predicted and observed secondary structure of T4 phage lysozyme</article-title>. <source>Biochimica et Biophysica Acta (BBA)-Protein Structure</source>, <volume>405</volume>(<issue>2</issue>): <fpage>442</fpage>–<lpage>451</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_030">
<mixed-citation publication-type="journal"> <string-name><surname>Moguerza</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Muñoz</surname> <given-names>A</given-names></string-name> (<year>2006</year>). <article-title>Support vector machines with applications</article-title>. <source>Statistical Science</source>, <volume>21</volume>(<issue>3</issue>): <fpage>322</fpage>–<lpage>336</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Ouedraogo</surname> <given-names>I</given-names></string-name>, <string-name><surname>Defourny</surname> <given-names>P</given-names></string-name>, <string-name><surname>Vanclooster</surname> <given-names>M</given-names></string-name> (<year>2019</year>). <article-title>Application of random forest regression and comparison of its performance to multiple linear regression in modeling groundwater nitrate concentration at the African continent scale</article-title>. <source>Hydrogeology Journal</source>, <volume>27</volume>(<issue>3</issue>): <fpage>1081</fpage>–<lpage>1098</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Pal</surname> <given-names>M</given-names></string-name> (<year>2005</year>). <article-title>Random forest classifier for remote sensing classification</article-title>. <source>International Journal of Remote Sensing</source>, <volume>26</volume>(<issue>1</issue>): <fpage>217</fpage>–<lpage>222</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_033">
<mixed-citation publication-type="journal"> <string-name><surname>Probst</surname> <given-names>P</given-names></string-name>, <string-name><surname>Wright</surname> <given-names>MN</given-names></string-name>, <string-name><surname>Boulesteix</surname> <given-names>AL</given-names></string-name> (<year>2019</year>). <article-title>Hyperparameters and tuning strategies for random forest</article-title>. <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>, <volume>9</volume>(<issue>3</issue>): <fpage>e1301</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>Rodriguez-Galiano</surname> <given-names>V</given-names></string-name>, <string-name><surname>Sanchez-Castillo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Chica-Olmo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Chica-Rivas</surname> <given-names>M</given-names></string-name> (<year>2015</year>). <article-title>Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines</article-title>. <source>Ore Geology Reviews</source>, <volume>71</volume>: <fpage>804</fpage>–<lpage>818</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_035">
<mixed-citation publication-type="journal"> <string-name><surname>Roy</surname> <given-names>MH</given-names></string-name>, <string-name><surname>Larocque</surname> <given-names>D</given-names></string-name> (<year>2012</year>). <article-title>Robustness of random forests for regression</article-title>. <source>Journal of Nonparametric Statistics</source>, <volume>24</volume>(<issue>4</issue>): <fpage>993</fpage>–<lpage>1006</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>Sage</surname> <given-names>AJ</given-names></string-name>, <string-name><surname>Genschel</surname> <given-names>U</given-names></string-name>, <string-name><surname>Nettleton</surname> <given-names>D</given-names></string-name> (<year>2020</year>). <article-title>Tree aggregation for random forest class probability estimation</article-title>. <source>Statistical Analysis and Data Mining: The ASA Data Science Journal</source>, <volume>13</volume>(<issue>2</issue>): <fpage>134</fpage>–<lpage>150</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_037">
<mixed-citation publication-type="journal"> <string-name><surname>Scornet</surname> <given-names>E</given-names></string-name> (<year>2016</year>). <article-title>Random forests and kernel methods</article-title>. <source>IEEE Transactions on Information Theory</source>, <volume>62</volume>(<issue>3</issue>): <fpage>1485</fpage>–<lpage>1500</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_038">
<mixed-citation publication-type="journal"> <string-name><surname>Scornet</surname> <given-names>E</given-names></string-name>, <string-name><surname>Biau</surname> <given-names>G</given-names></string-name>, <string-name><surname>Vert</surname> <given-names>JP</given-names></string-name>, <etal>et al.</etal> (<year>2015</year>). <article-title>Consistency of random forests</article-title>. <source>The Annals of Statistics</source>, <volume>43</volume>(<issue>4</issue>): <fpage>1716</fpage>–<lpage>1741</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_039">
<mixed-citation publication-type="chapter"> <string-name><surname>Shivaswamy</surname> <given-names>PK</given-names></string-name>, <string-name><surname>Chu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Jansche</surname> <given-names>M</given-names></string-name> (<year>2007</year>). <chapter-title>A support vector approach to censored targets</chapter-title>. In: <source>Seventh IEEE International Conference on Data Mining (ICDM 2007)</source>, <fpage>655</fpage>–<lpage>660</lpage>. <publisher-name>IEEE</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_040">
<mixed-citation publication-type="journal"> <string-name><surname>Statnikov</surname> <given-names>A</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Aliferis</surname> <given-names>CF</given-names></string-name> (<year>2008</year>). <article-title>A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification</article-title>. <source>BMC Bioinformatics</source>, <volume>9</volume>(<issue>1</issue>): <fpage>319</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_041">
<mixed-citation publication-type="chapter"> <string-name><surname>Syarif</surname> <given-names>I</given-names></string-name>, <string-name><surname>Zaluska</surname> <given-names>E</given-names></string-name>, <string-name><surname>Prugel-Bennett</surname> <given-names>A</given-names></string-name>, <string-name><surname>Wills</surname> <given-names>G</given-names></string-name> (<year>2012</year>). <chapter-title>Application of bagging, boosting and stacking to intrusion detection</chapter-title>. In: <source>International Workshop on Machine Learning and Data Mining in Pattern Recognition</source>, <fpage>593</fpage>–<lpage>602</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_042">
<mixed-citation publication-type="journal"> <string-name><surname>Tang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Ishwaran</surname> <given-names>H</given-names></string-name> (<year>2017</year>). <article-title>Random forest missing data algorithms</article-title>. <source>Statistical Analysis and Data Mining: The ASA Data Science Journal</source>, <volume>10</volume>(<issue>6</issue>): <fpage>363</fpage>–<lpage>377</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_043">
<mixed-citation publication-type="journal"> <string-name><surname>Van der Laan</surname> <given-names>MJ</given-names></string-name>, <string-name><surname>Polley</surname> <given-names>EC</given-names></string-name>, <string-name><surname>Hubbard</surname> <given-names>AE</given-names></string-name> (<year>2007</year>). <article-title>Super learner</article-title>. <source>Statistical Applications in Genetics and Molecular Biology</source>, <volume>6</volume>(<issue>1</issue>).</mixed-citation>
</ref>
<ref id="j_jds1025_ref_044">
<mixed-citation publication-type="journal"> <string-name><surname>Vapnik</surname> <given-names>VN</given-names></string-name> (<year>1999</year>). <article-title>An overview of statistical learning theory</article-title>. <source>IEEE Transactions on Neural Networks</source>, <volume>10</volume>(<issue>5</issue>): <fpage>988</fpage>–<lpage>999</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_045">
<mixed-citation publication-type="journal"> <string-name><surname>Wang</surname> <given-names>BX</given-names></string-name>, <string-name><surname>Japkowicz</surname> <given-names>N</given-names></string-name> (<year>2010</year>). <article-title>Boosting support vector machines for imbalanced data sets</article-title>. <source>Knowledge and Information Systems</source>, <volume>25</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>20</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_046">
<mixed-citation publication-type="chapter"> <string-name><surname>Wu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>EY</given-names></string-name> (<year>2003</year>). <chapter-title>Class-boundary alignment for imbalanced dataset learning</chapter-title>. In: <source>ICML 2003 Workshop on Learning from Imbalanced Data Sets II</source>, <fpage>49</fpage>–<lpage>56</lpage>. <conf-loc>Washington, DC</conf-loc>.</mixed-citation>
</ref>
<ref id="j_jds1025_ref_047">
<mixed-citation publication-type="journal"> <string-name><surname>Zareapoor</surname> <given-names>M</given-names></string-name>, <string-name><surname>Shamsolmoali</surname> <given-names>P</given-names></string-name>, <etal>et al.</etal> (<year>2015</year>). <article-title>Application of credit card fraud detection: Based on bagging ensemble classifier</article-title>. <source>Procedia Computer Science</source>, <volume>48</volume>: <fpage>679</fpage>–<lpage>685</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
