<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1068</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1068</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science in Action</subject></subj-group></article-categories>
<title-group>
<article-title>Identifying Prerequisite Courses in Undergraduate Biology Using Machine Learning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Lee</surname><given-names>Youngjin</given-names></name><email xlink:href="mailto:Youngjin.Lee@unt.edu">Youngjin.Lee@unt.edu</email><xref ref-type="aff" rid="j_jds1068_aff_001">1</xref><xref ref-type="fn" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1068_aff_001"><label>1</label>3940 N. Elm St., Denton, TX 76207, <institution>University of North Texas</institution>, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Email: <ext-link ext-link-type="uri" xlink:href="mailto:Youngjin.Lee@unt.edu">Youngjin.Lee@unt.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2023</year></pub-date><pub-date pub-type="epub"><day>20</day><month>10</month><year>2022</year></pub-date><volume>21</volume><issue>4</issue><fpage>745</fpage><lpage>760</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1068_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>This includes the data file containing the training and test sets analyzed in the study, and all R code used in the analysis along with an explanatory README.txt file.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>6</day><month>5</month><year>2022</year></date><date date-type="accepted"><day>16</day><month>9</month><year>2022</year></date></history>
<permissions><copyright-statement>2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Many undergraduate students who matriculate in Science, Technology, Engineering and Mathematics (STEM) degree programs drop out or switch majors. Previous studies indicate that student performance in prerequisite courses is an important factor in STEM attrition. This study analyzed demographic information, ACT/SAT scores, and student performance in freshman-year courses to develop machine learning models that predict success in earning a bachelor’s degree in biology. The predictive models based on Random Forest (RF) and Extreme Gradient Boosting (XGBoost) achieved a higher AUC (Area Under the Curve), with more balanced sensitivity and specificity, than the Logistic Regression (LR), K-Nearest Neighbor (KNN), and Neural Network (NN) models. An explainable machine learning approach called break-down was employed to identify the freshman-year courses that could have a larger impact on student success, at both the biology degree program level and the individual student level. The important courses identified at the program level can help program coordinators prioritize their efforts to address student attrition, while those identified at the student level can help academic advisors provide more personalized, data-driven guidance to students.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>attrition</kwd>
<kwd>Educational Data Mining</kwd>
<kwd>Learning Analytics</kwd>
<kwd>STEM education</kwd>
<kwd>student success</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1068_reflist_001">
<title>References</title>
<ref id="j_jds1068_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Alexander</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>E</given-names></string-name>, <string-name><surname>Grumbach</surname> <given-names>K</given-names></string-name> (<year>2009</year>). <article-title>How leaky is the health career pipeline? Minority student achievement in college gateway courses</article-title>. <source><italic>Academic Medicine</italic></source>, <volume>84</volume>(<issue>6</issue>): <fpage>797</fpage>–<lpage>802</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Altman</surname> <given-names>NS</given-names></string-name> (<year>1992</year>). <article-title>An introduction to kernel and nearest-neighbor nonparametric regression</article-title>. <source><italic>American Statistician</italic></source>, <volume>46</volume>(<issue>3</issue>): <fpage>175</fpage>–<lpage>185</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_003">
<mixed-citation publication-type="chapter"> <string-name><surname>Aulck</surname> <given-names>L</given-names></string-name>, <string-name><surname>Nambi</surname> <given-names>D</given-names></string-name>, <string-name><surname>Velagapudi</surname> <given-names>N</given-names></string-name>, <string-name><surname>Blumenstock</surname> <given-names>J</given-names></string-name>, <string-name><surname>West</surname> <given-names>J</given-names></string-name> (<year>2019</year>). <chapter-title>Mining university registrar records to predict first-year undergraduate attrition</chapter-title>. In: <source><italic>Proceedings of the 12th International Conference on Educational Data Mining</italic></source> (<string-name><given-names>CF</given-names> <surname>Lynch</surname></string-name>, <string-name><given-names>A</given-names> <surname>Merceron</surname></string-name>, <string-name><given-names>M</given-names> <surname>Desmarais</surname></string-name>, <string-name><given-names>R</given-names> <surname>Nkambou</surname></string-name>, eds.), <fpage>9</fpage>–<lpage>18</lpage>. <publisher-name>International Educational Data Mining Society</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_004">
<mixed-citation publication-type="book"> <string-name><surname>Ausubel</surname> <given-names>DP</given-names></string-name> (<year>1963</year>). <source><italic>The Psychology of Meaningful Verbal Learning</italic></source>. <publisher-name>Grune &amp; Stratton</publisher-name>, <publisher-loc>New York, NY</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_005">
<mixed-citation publication-type="chapter"> <string-name><surname>Bayer</surname> <given-names>J</given-names></string-name>, <string-name><surname>Bydzovská</surname> <given-names>H</given-names></string-name>, <string-name><surname>Géryk</surname> <given-names>J</given-names></string-name>, <string-name><surname>Obsivac</surname> <given-names>T</given-names></string-name>, <string-name><surname>Popelinsky</surname> <given-names>L</given-names></string-name> (<year>2012</year>). <chapter-title>Predicting drop-out from social behavior of students</chapter-title>. In: <source><italic>Proceedings of the 5th International Conference on Educational Data Mining</italic></source> (<string-name><given-names>K</given-names> <surname>Yacef</surname></string-name>, <string-name><given-names>O</given-names> <surname>Zaïane</surname></string-name>, <string-name><given-names>A</given-names> <surname>Hershkovitz</surname></string-name>, <string-name><given-names>M</given-names> <surname>Yudelson</surname></string-name>, <string-name><given-names>J</given-names> <surname>Stamper</surname></string-name>, eds.), <fpage>103</fpage>–<lpage>109</lpage>. <publisher-name>International Educational Data Mining Society</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Berens</surname> <given-names>J</given-names></string-name>, <string-name><surname>Schneider</surname> <given-names>K</given-names></string-name>, <string-name><surname>Görtz</surname> <given-names>S</given-names></string-name>, <string-name><surname>Oster</surname> <given-names>S</given-names></string-name>, <string-name><surname>Burghoff</surname> <given-names>J</given-names></string-name> (<year>2019</year>). <article-title>Early detection of students at risk – predicting student dropouts using administrative student data and machine learning methods</article-title>. <source><italic>Journal of Educational Data Mining</italic></source>, <volume>11</volume>(<issue>3</issue>): <fpage>1</fpage>–<lpage>41</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Bettencourt</surname> <given-names>GM</given-names></string-name>, <string-name><surname>Manly</surname> <given-names>CA</given-names></string-name>, <string-name><surname>Kimball</surname> <given-names>E</given-names></string-name>, <string-name><surname>Wells</surname> <given-names>RS</given-names></string-name> (<year>2020</year>). <article-title>STEM degree completion and first-generation college students: A cumulative disadvantage approach to the outcomes gap</article-title>. <source><italic>Review of Higher Education</italic></source>, <volume>43</volume>(<issue>3</issue>): <fpage>753</fpage>–<lpage>779</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_008">
<mixed-citation publication-type="book"> <string-name><surname>Biecek</surname> <given-names>P</given-names></string-name>, <string-name><surname>Burzykowski</surname> <given-names>T</given-names></string-name> (<year>2021</year>). <source><italic>Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models</italic></source>. <publisher-name>Chapman &amp; Hall/CRC</publisher-name>, <publisher-loc>Boca Raton, FL</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>2001</year>). <article-title>Random forests</article-title>. <source><italic>Machine Learning</italic></source>, <volume>45</volume>(<issue>1</issue>): <fpage>5</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_010">
<mixed-citation publication-type="book"> <string-name><surname>Bruner</surname> <given-names>J</given-names></string-name> (<year>1974</year>). <source><italic>Toward a Theory of Instruction</italic></source>. <publisher-name>Belknap Press</publisher-name>, <publisher-loc>Cambridge, MA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_011">
<mixed-citation publication-type="chapter"> <string-name><surname>Chen</surname> <given-names>T</given-names></string-name>, <string-name><surname>Guestrin</surname> <given-names>C</given-names></string-name> (<year>2016</year>). <chapter-title>XGBoost: A scalable tree boosting system</chapter-title>. In: <source><italic>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</italic></source> (<string-name><given-names>B</given-names> <surname>Krishnapuram</surname></string-name>, <string-name><given-names>M</given-names> <surname>Shah</surname></string-name>, <string-name><given-names>AJ</given-names> <surname>Smola</surname></string-name>, <string-name><given-names>C</given-names> <surname>Aggarwal</surname></string-name>, <string-name><given-names>D</given-names> <surname>Shen</surname></string-name>, <string-name><given-names>R</given-names> <surname>Rastogi</surname></string-name>, eds.), <fpage>785</fpage>–<lpage>794</lpage>. <publisher-name>Association for Computing Machinery</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_012">
<mixed-citation publication-type="other"> <string-name><surname>Chen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ho</surname> <given-names>P</given-names></string-name> (2012). STEM in postsecondary education: Entrance, attrition, and coursetaking among 2003–2004 beginning postsecondary students. <italic>NCES Report no. 2013-152</italic>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_013">
<mixed-citation publication-type="other"> <string-name><surname>Chen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Soldner</surname> <given-names>M</given-names></string-name> (2013). STEM attrition: College students’ paths into and out of STEM fields. <italic>NCES Report no. 2014-001</italic>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_014">
<mixed-citation publication-type="other"> <string-name><surname>Chen</surname> <given-names>X</given-names></string-name>, <string-name><surname>Weko</surname> <given-names>T</given-names></string-name> (2009). Students who study science, technology, engineering, and mathematics (STEM) in postsecondary education. <italic>NCES Report no. 2009-161</italic>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_015">
<mixed-citation publication-type="chapter"> <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Johri</surname> <given-names>A</given-names></string-name>, <string-name><surname>Rangwala</surname> <given-names>H</given-names></string-name> (<year>2018</year>). <chapter-title>Running out of STEM: A comparative study across STEM majors of college students at-risk of dropping out early</chapter-title>. In: <source><italic>Proceedings of the 8th International Conference on Learning Analytics and Knowledge</italic></source> (<string-name><given-names>A</given-names> <surname>Pardo</surname></string-name>, <string-name><given-names>K</given-names> <surname>Bartimote</surname></string-name>, <string-name><given-names>G</given-names> <surname>Lynch</surname></string-name>, <string-name><given-names>S</given-names> <surname>Buckingham Shum</surname></string-name>, <string-name><given-names>R</given-names> <surname>Ferguson</surname></string-name>, <string-name><given-names>A</given-names> <surname>Merceron</surname></string-name>, <string-name><given-names>X</given-names> <surname>Ochoa</surname></string-name>, eds.), <fpage>270</fpage>–<lpage>279</lpage>. <publisher-name>Society for Learning Analytics Research</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Cochran</surname> <given-names>JD</given-names></string-name>, <string-name><surname>Campbell</surname> <given-names>SM</given-names></string-name>, <string-name><surname>Baker</surname> <given-names>HM</given-names></string-name>, <string-name><surname>Leeds</surname> <given-names>EM</given-names></string-name> (<year>2013</year>). <article-title>The role of student characteristics in predicting retention in online courses</article-title>. <source><italic>Research in Higher Education</italic></source>, <volume>55</volume>(<issue>1</issue>): <fpage>27</fpage>–<lpage>48</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Cox</surname> <given-names>DR</given-names></string-name> (<year>1958</year>). <article-title>The regression analysis of binary sequences</article-title>. <source><italic>Journal of the Royal Statistical Society, Series B, Methodological</italic></source>, <volume>20</volume>(<issue>2</issue>): <fpage>215</fpage>–<lpage>232</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Cromley</surname> <given-names>JG</given-names></string-name>, <string-name><surname>Perez</surname> <given-names>T</given-names></string-name>, <string-name><surname>Kaplan</surname> <given-names>A</given-names></string-name> (<year>2015</year>). <article-title>Undergraduate STEM achievement and retention: Cognitive, motivational, and institutional factors and solutions</article-title>. <source><italic>Policy Insights From the Behavioral and Brain Sciences</italic></source>, <volume>3</volume>(<issue>1</issue>): <fpage>4</fpage>–<lpage>11</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Strobl</surname> <given-names>C</given-names></string-name>, <string-name><surname>Boulesteix</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zeileis</surname> <given-names>A</given-names></string-name>, <string-name><surname>Hothorn</surname> <given-names>T</given-names></string-name> (<year>2007</year>). <article-title>Bias in random forest variable importance measures: Illustrations, sources and a solution</article-title>. <source><italic>BMC Bioinformatics</italic></source>, <volume>8</volume>: <fpage>25</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Dai</surname> <given-names>T</given-names></string-name>, <string-name><surname>Cromley</surname> <given-names>JG</given-names></string-name> (<year>2014</year>). <article-title>Changes in implicit theories of ability in biology and dropout from STEM majors: A latent growth curve approach</article-title>. <source><italic>Contemporary Educational Psychology</italic></source>, <volume>39</volume>(<issue>3</issue>): <fpage>233</fpage>–<lpage>247</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Delen</surname> <given-names>D</given-names></string-name> (<year>2011</year>). <article-title>Predicting student attrition with data mining methods</article-title>. <source><italic>Journal of College Student Retention</italic></source>, <volume>13</volume>(<issue>1</issue>): <fpage>17</fpage>–<lpage>35</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Ehrenberg</surname> <given-names>RG</given-names></string-name> (<year>2010</year>). <article-title>Analyzing the factors that influence persistence rates in STEM field majors: Introduction to the symposium</article-title>. <source><italic>Economics of Education Review</italic></source>, <volume>29</volume>: <fpage>888</fpage>–<lpage>891</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Fawcett</surname> <given-names>T</given-names></string-name> (<year>2006</year>). <article-title>An introduction to ROC analysis</article-title>. <source><italic>Pattern Recognition Letters</italic></source>, <volume>27</volume>(<issue>8</issue>): <fpage>861</fpage>–<lpage>874</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_024">
<mixed-citation publication-type="book"> <string-name><surname>Fernández</surname> <given-names>A</given-names></string-name>, <string-name><surname>García</surname> <given-names>S</given-names></string-name>, <string-name><surname>Galar</surname> <given-names>M</given-names></string-name>, <string-name><surname>Prati</surname> <given-names>RC</given-names></string-name>, <string-name><surname>Krawczyk</surname> <given-names>B</given-names></string-name>, <string-name><surname>Herrera</surname> <given-names>F</given-names></string-name> (<year>2018</year>). <source><italic>Learning from Imbalanced Data Sets</italic></source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Cham, Switzerland</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_025">
<mixed-citation publication-type="book"> <string-name><surname>Gagné</surname> <given-names>RM</given-names></string-name>, <string-name><surname>Briggs</surname> <given-names>LJ</given-names></string-name> (<year>1974</year>). <source><italic>Principles of Instructional Design</italic></source>. <publisher-name>Holt, Rinehart &amp; Winston</publisher-name>, <publisher-loc>New York, NY</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Gasiewski</surname> <given-names>JA</given-names></string-name>, <string-name><surname>Eagan</surname> <given-names>MK</given-names></string-name>, <string-name><surname>Garcia</surname> <given-names>GA</given-names></string-name>, <string-name><surname>Hurtado</surname> <given-names>S</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>M</given-names></string-name> (<year>2012</year>). <article-title>From gatekeeping to engagement: A multicontextual, mixed method study of student academic engagement in introductory STEM courses</article-title>. <source><italic>Research in Higher Education</italic></source>, <volume>53</volume>(<issue>2</issue>): <fpage>229</fpage>–<lpage>261</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_027">
<mixed-citation publication-type="book"> <string-name><surname>James</surname> <given-names>G</given-names></string-name>, <string-name><surname>Witten</surname> <given-names>D</given-names></string-name>, <string-name><surname>Hastie</surname> <given-names>T</given-names></string-name>, <string-name><surname>Tibshirani</surname> <given-names>R</given-names></string-name> (<year>2013</year>). <source><italic>An Introduction to Statistical Learning: With Applications in R</italic></source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York, NY</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_028">
<mixed-citation publication-type="book"> <string-name><surname>Kuhn</surname> <given-names>M</given-names></string-name>, <string-name><surname>Silge</surname> <given-names>J</given-names></string-name> (<year>2022</year>). <source><italic>Tidy Modeling with R: A Framework for Modeling in the Tidyverse</italic></source>. <publisher-name>O’Reilly</publisher-name>, <publisher-loc>Sebastopol, CA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_029">
<mixed-citation publication-type="book"> <string-name><surname>Kleinbaum</surname> <given-names>DG</given-names></string-name>, <string-name><surname>Klein</surname> <given-names>M</given-names></string-name> (<year>2010</year>). <source><italic>Logistic Regression: A Self-Learning Text</italic></source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York, NY</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_030">
<mixed-citation publication-type="chapter"> <string-name><surname>Kovačić</surname> <given-names>ZJ</given-names></string-name> (<year>2010</year>). <chapter-title>Early prediction of student success: Mining students enrolment data</chapter-title>. In: <source><italic>Proceedings of Informing Science IT Education Conference</italic></source> (<string-name><given-names>E</given-names> <surname>Cohen</surname></string-name>, ed.), <fpage>647</fpage>–<lpage>665</lpage>. <publisher-name>Informing Science Institute</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_031">
<mixed-citation publication-type="journal"> <string-name><surname>Kuhn</surname> <given-names>TK</given-names></string-name>, <string-name><surname>Gordon</surname> <given-names>VN</given-names></string-name>, <string-name><surname>Webber</surname> <given-names>J</given-names></string-name> (<year>2006</year>). <article-title>The advising and counseling continuum: Triggers for referral</article-title>. <source><italic>NACADA Journal</italic></source>, <volume>26</volume>: <fpage>24</fpage>–<lpage>31</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Le</surname> <given-names>H</given-names></string-name>, <string-name><surname>Robbins</surname> <given-names>SB</given-names></string-name>, <string-name><surname>Westrick</surname> <given-names>P</given-names></string-name> (<year>2014</year>). <article-title>Predicting student enrollment and persistence in college STEM fields using an expanded P-E fit framework: A large-scale multilevel study</article-title>. <source><italic>Journal of Applied Psychology</italic></source>, <volume>99</volume>(<issue>5</issue>): <fpage>915</fpage>–<lpage>947</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_033">
<mixed-citation publication-type="journal"> <string-name><surname>Lee</surname> <given-names>YG</given-names></string-name>, <string-name><surname>Ferrare</surname> <given-names>JJ</given-names></string-name> (<year>2019</year>). <article-title>Finding one’s place or losing the race? The consequences of STEM departure for college dropout and degree completion</article-title>. <source><italic>Review of Higher Education</italic></source>, <volume>43</volume>(<issue>1</issue>): <fpage>221</fpage>–<lpage>261</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>Loh</surname> <given-names>WY</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>P</given-names></string-name> (<year>2021</year>). <article-title>Variable importance scores</article-title>. <source><italic>Journal of Data Science</italic></source>, <volume>19</volume>(<issue>4</issue>): <fpage>569</fpage>–<lpage>592</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_035">
<mixed-citation publication-type="book"> <string-name><surname>Malcom</surname> <given-names>S</given-names></string-name>, <string-name><surname>Feder</surname> <given-names>M</given-names></string-name> (<year>2016</year>). <source><italic>Barriers and Opportunities for 2-Year and 4-Year STEM Degrees: Systemic Change to Support Students’ Diverse Pathways</italic></source>. <publisher-name>The National Academies Press</publisher-name>, <publisher-loc>Washington, DC</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>McCulloch</surname> <given-names>WS</given-names></string-name>, <string-name><surname>Pitts</surname> <given-names>W</given-names></string-name> (<year>1943</year>). <article-title>A logical calculus of the ideas immanent in nervous activity</article-title>. <source><italic>The Bulletin of Mathematical Biophysics</italic></source>, <volume>5</volume>(<issue>4</issue>): <fpage>115</fpage>–<lpage>133</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_037">
<mixed-citation publication-type="chapter"> <string-name><surname>Nagy</surname> <given-names>M</given-names></string-name>, <string-name><surname>Molontay</surname> <given-names>R</given-names></string-name> (<year>2018</year>). <chapter-title>Predicting dropout in higher education based on secondary school performance</chapter-title>. In: <source><italic>Proceedings of the 22nd IEEE International Conference on Intelligent Engineering Systems</italic></source>, <fpage>389</fpage>–<lpage>394</lpage>. <publisher-name>Institute of Electrical and Electronics Engineers</publisher-name>. <comment>Paper is available in IEEE Xplore: <uri>https://ieeexplore.ieee.org/abstract/document/8523888</uri></comment>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_038">
<mixed-citation publication-type="other"> National Academy of Education (2017). <italic>Big data in education: Balancing the benefits of educational research and student privacy: A workshop summary</italic>. National Academy of Education, Washington, DC.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_039">
<mixed-citation publication-type="other"> National Science Board (2018). Science &amp; engineering indicators 2018. <italic>NSB Report 2018-1</italic>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_040">
<mixed-citation publication-type="other"> <string-name><surname>Olson</surname> <given-names>S</given-names></string-name>, <string-name><surname>Riordan</surname> <given-names>DG</given-names></string-name> (2012). Engage to excel: Producing one million additional college graduates with degrees in science, technology, engineering, and mathematics. <italic>Report to the President</italic>. Washington, DC.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_041">
<mixed-citation publication-type="journal"> <string-name><surname>Patrick</surname> <given-names>AD</given-names></string-name>, <string-name><surname>Prybutok</surname> <given-names>AN</given-names></string-name>, <string-name><surname>Borrego</surname> <given-names>M</given-names></string-name> (<year>2021</year>). <article-title>Predicting persistence in engineering through an engineering identity scale</article-title>. <source><italic>International Journal of Engineering Education</italic></source>, <volume>34</volume>(<issue>2A</issue>): <fpage>351</fpage>–<lpage>363</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_042">
<mixed-citation publication-type="journal"> <string-name><surname>Reigeluth</surname> <given-names>CM</given-names></string-name> (<year>1979</year>). <article-title>In search of a better way to organize instruction: The elaboration theory</article-title>. <source><italic>Journal of Instructional Development</italic></source>, <volume>2</volume>(<issue>3</issue>): <fpage>8</fpage>–<lpage>15</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_043">
<mixed-citation publication-type="journal"> <string-name><surname>Reigeluth</surname> <given-names>CM</given-names></string-name>, <string-name><surname>Merrill</surname> <given-names>MD</given-names></string-name>, <string-name><surname>Bunderson</surname> <given-names>CV</given-names></string-name> (<year>1978</year>). <article-title>The structure of subject matter content and its instructional design implications</article-title>. <source><italic>Instructional Science</italic></source>, <volume>7</volume>: <fpage>107</fpage>–<lpage>126</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_044">
<mixed-citation publication-type="journal"> <string-name><surname>Schwebel</surname> <given-names>D</given-names></string-name>, <string-name><surname>Walburn</surname> <given-names>N</given-names></string-name>, <string-name><surname>Klyce</surname> <given-names>K</given-names></string-name>, <string-name><surname>Jerrolds</surname> <given-names>K</given-names></string-name> (<year>2012</year>). <article-title>Efficacy of advising outreach on student retention, academic progress and achievement, and frequency of advising contacts: A longitudinal randomized trial</article-title>. <source><italic>NACADA Journal</italic></source>, <volume>32</volume>: <fpage>36</fpage>–<lpage>43</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_045">
<mixed-citation publication-type="journal"> <string-name><surname>Shewry</surname> <given-names>MC</given-names></string-name>, <string-name><surname>Wynn</surname> <given-names>HP</given-names></string-name> (<year>1987</year>). <article-title>Maximum entropy sampling</article-title>. <source><italic>Journal of Applied Statistics</italic></source>, <volume>14</volume>(<issue>2</issue>): <fpage>165</fpage>–<lpage>170</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_046">
<mixed-citation publication-type="journal"> <string-name><surname>Smith</surname> <given-names>M</given-names></string-name>, <string-name><surname>Therry</surname> <given-names>L</given-names></string-name>, <string-name><surname>Whale</surname> <given-names>J</given-names></string-name> (<year>2012</year>). <article-title>Developing a model for identifying students at risk of failure in a first year accounting unit</article-title>. <source><italic>Higher Education Studies</italic></source>, <volume>2</volume>(<issue>4</issue>): <fpage>91</fpage>–<lpage>102</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_047">
<mixed-citation publication-type="journal"> <string-name><surname>Sullivan</surname> <given-names>JF</given-names></string-name> (<year>2006</year>). <article-title>Broadening engineering’s participation-a call for K-16 engineering education</article-title>. <source><italic>The Bridge</italic></source>, <volume>36</volume>(<issue>2</issue>): <fpage>17</fpage>–<lpage>24</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_048">
<mixed-citation publication-type="journal"> <string-name><surname>Suresh</surname> <given-names>R</given-names></string-name> (<year>2007</year>). <article-title>The relationship between barrier courses and persistence in engineering</article-title>. <source><italic>Journal of College Student Retention</italic></source>, <volume>8</volume>(<issue>2</issue>): <fpage>215</fpage>–<lpage>239</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_049">
<mixed-citation publication-type="journal"> <string-name><surname>Thompson</surname> <given-names>R</given-names></string-name>, <string-name><surname>Bolin</surname> <given-names>G</given-names></string-name> (<year>2011</year>). <article-title>Indicators of success in STEM majors: A cohort study</article-title>. <source><italic>Journal of College Admission</italic></source>, <volume>212</volume>: <fpage>18</fpage>–<lpage>24</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1068_ref_050">
<mixed-citation publication-type="book"> <string-name><surname>Xie</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Killewald</surname> <given-names>AA</given-names></string-name> (<year>2012</year>). <source><italic>Is American Science in Decline?</italic></source>. <publisher-name>Harvard University Press</publisher-name>, <publisher-loc>Cambridge, MA</publisher-loc>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
