<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1084</article-id>
<article-id pub-id-type="doi">10.6339/23-JDS1084</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>Random Forest of Interaction Trees for Estimating Individualized Treatment Regimes with Ordered Treatment Levels in Observational Studies</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Thorp</surname><given-names>Justin</given-names></name><xref ref-type="aff" rid="j_jds1084_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-7553-4264</contrib-id>
<name><surname>Levine</surname><given-names>Richard A.</given-names></name><email xlink:href="mailto:rlevine@sdsu.edu">rlevine@sdsu.edu</email><xref ref-type="aff" rid="j_jds1084_aff_001">1</xref><xref ref-type="aff" rid="j_jds1084_aff_002">2</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Li</surname><given-names>Luo</given-names></name><xref ref-type="aff" rid="j_jds1084_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Fan</surname><given-names>Juanjuan</given-names></name><email xlink:href="mailto:jjfan@sdsu.edu">jjfan@sdsu.edu</email><xref ref-type="aff" rid="j_jds1084_aff_001">1</xref><xref ref-type="aff" rid="j_jds1084_aff_002">2</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1084_aff_001"><label>1</label>Department of Mathematics and Statistics, <institution>San Diego State University</institution>, 5500 Campanile Drive, San Diego, CA 92182, <country>USA</country></aff>
<aff id="j_jds1084_aff_002"><label>2</label>Analytic Studies &amp; Institutional Research, <institution>San Diego State University</institution>, 5500 Campanile Drive, San Diego, CA 92182, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:rlevine@sdsu.edu">rlevine@sdsu.edu</ext-link> or <ext-link ext-link-type="uri" xlink:href="mailto:jjfan@sdsu.edu">jjfan@sdsu.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2023</year></pub-date><pub-date pub-type="epub"><day>2</day><month>2</month><year>2023</year></pub-date><volume>21</volume><issue>2</issue><fpage>391</fpage><lpage>411</lpage><history><date date-type="received"><day>31</day><month>7</month><year>2022</year></date><date date-type="accepted"><day>6</day><month>1</month><year>2023</year></date></history>
<permissions><copyright-statement>2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Traditional methods for evaluating a potential treatment have focused on the average treatment effect. However, individuals can respond to a treatment in substantially heterogeneous ways, and in such situations one needs to account for the differences among individuals when estimating the treatment effect. <xref ref-type="bibr" rid="j_jds1084_ref_008">Li et al.</xref> (<xref ref-type="bibr" rid="j_jds1084_ref_008">2022</xref>) proposed a method based on random forest of interaction trees (RFIT) for a binary or categorical treatment variable that incorporates the propensity score in the construction of the random forest. Motivated by the need to evaluate the effect of tutoring sessions at a Math and Stat Learning Center (MSLC), we extend their approach to an ordinal treatment variable. Our approach improves upon RFIT for multiple treatments by incorporating the ordered structure of the treatment variable into the tree-growing process. To illustrate the effectiveness of the proposed method, we conduct simulation studies, which show that it achieves a lower mean squared error and a higher rate of correct optimal treatment classification, and that it identifies the most important variables that impact the treatment effect. We then apply the proposed method to estimate how the number of visits to the MSLC impacts an individual student’s probability of passing an introductory statistics course. Our results show that every student is recommended to visit the MSLC at least once, and that some students can drastically improve their chance of passing the course by visiting the optimal number of times suggested by our analysis.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>educational data mining</kwd>
<kwd>generalized propensity scores</kwd>
<kwd>individualized treatment effect</kwd>
<kwd>machine learning</kwd>
<kwd>student success study</kwd>
</kwd-group>
<funding-group><award-group><funding-source xlink:href="https://doi.org/10.13039/100000001">National Science Foundation</funding-source><award-id>1633130</award-id></award-group><funding-statement>This research was supported in part by National Science Foundation grant 1633130.</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1084_reflist_001">
<title>References</title>
<ref id="j_jds1084_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Alemayehu</surname> <given-names>D</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Markatou</surname> <given-names>M</given-names></string-name> (<year>2017</year>). <article-title>A comparative study of subgroup identification methods for differential treatment effect: Performance metrics and recommendations</article-title>. <source><italic>Statistical Methods in Medical Research</italic></source>, <volume>27</volume>(<issue>12</issue>): <fpage>3658</fpage>–<lpage>3678</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name> (<year>2001</year>). <article-title>Random forests</article-title>. <source><italic>Machine Learning</italic></source>, <volume>45</volume>: <fpage>5</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_003">
<mixed-citation publication-type="book"> <string-name><surname>Breiman</surname> <given-names>L</given-names></string-name>, <string-name><surname>Friedman</surname> <given-names>J</given-names></string-name>, <string-name><surname>Stone</surname> <given-names>C</given-names></string-name>, <string-name><surname>Olshen</surname> <given-names>R</given-names></string-name> (<year>1984</year>). <source><italic>Classification and Regression Trees</italic></source>. <publisher-name>Chapman and Hall/CRC</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_004">
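<mixed-citation publication-type="journal"> <string-name><surname>Chipman</surname> <given-names>H</given-names></string-name>, <string-name><surname>George</surname> <given-names>E</given-names></string-name>, <string-name><surname>McCulloch</surname> <given-names>R</given-names></string-name> (<year>2010</year>). <article-title>BART: Bayesian additive regression trees</article-title>. <source><italic>The Annals of Applied Statistics</italic></source>, <volume>4</volume>(<issue>1</issue>): <fpage>266</fpage>–<lpage>298</lpage>.</mixed-citation>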
</ref>
<ref id="j_jds1084_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Dusseldorp</surname> <given-names>E</given-names></string-name>, <string-name><surname>Mechelen</surname> <given-names>I</given-names></string-name> (<year>2013</year>). <article-title>Qualitative interaction trees: A tool to identify qualitative treatment-subgroup interactions</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>33</volume>: <fpage>219</fpage>–<lpage>237</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Imbens</surname> <given-names>G</given-names></string-name> (<year>2000</year>). <article-title>The role of the propensity score in estimating dose-response functions</article-title>. <source><italic>Biometrika</italic></source>, <volume>87</volume>: <fpage>706</fpage>–<lpage>710</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Ishwaran</surname> <given-names>H</given-names></string-name>, <string-name><surname>Kogalur</surname> <given-names>U</given-names></string-name>, <string-name><surname>Gorodeski</surname> <given-names>E</given-names></string-name>, <string-name><surname>Minn</surname> <given-names>A</given-names></string-name>, <string-name><surname>Lauer</surname> <given-names>M</given-names></string-name> (<year>2010</year>). <article-title>High-dimensional variable selection for survival data</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>105</volume>: <fpage>205</fpage>–<lpage>217</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Li</surname> <given-names>L</given-names></string-name>, <string-name><surname>Levine</surname> <given-names>R</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>J</given-names></string-name> (<year>2022</year>). <article-title>Causal effect random forest of interaction trees for learning individualized treatment regimes with multiple treatments in observational studies</article-title>. <source><italic>Stat</italic></source>, <volume>11</volume>(<issue>1</issue>): <fpage>e457</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Lipkovich</surname> <given-names>I</given-names></string-name>, <string-name><surname>Dmitrienko</surname> <given-names>A</given-names></string-name>, <string-name><surname>Denne</surname> <given-names>J</given-names></string-name>, <string-name><surname>Enas</surname> <given-names>G</given-names></string-name> (<year>2011</year>). <article-title>Subgroup identification based on differential effect search-a recursive partitioning method for establishing response to treatment in patient subpopulations</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>30</volume>(<issue>21</issue>): <fpage>2601</fpage>–<lpage>2621</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Rosenbaum</surname> <given-names>P</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>D</given-names></string-name> (<year>1983</year>). <article-title>The central role of the propensity score in observational studies for causal effects</article-title>. <source><italic>Biometrika</italic></source>, <volume>70</volume>: <fpage>41</fpage>–<lpage>55</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1974</year>). <article-title>Estimating causal effects of treatments in randomized and nonrandomized studies</article-title>. <source><italic>Journal of Educational Psychology</italic></source>, <volume>66</volume>(<issue>5</issue>): <fpage>688</fpage>–<lpage>701</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Sparapani</surname> <given-names>R</given-names></string-name>, <string-name><surname>Spanbauer</surname> <given-names>C</given-names></string-name>, <string-name><surname>McCulloch</surname> <given-names>R</given-names></string-name> (<year>2021</year>). <article-title>Nonparametric machine learning and efficient computation with Bayesian additive regression trees: The BART R package</article-title>. <source><italic>Journal of Statistical Software</italic></source>, <volume>97</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>66</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Su</surname> <given-names>X</given-names></string-name>, <string-name><surname>Peña</surname> <given-names>A</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Levine</surname> <given-names>R</given-names></string-name> (<year>2018</year>). <article-title>Random forests of interaction trees for estimating individualized treatment effects in randomized trials</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>37</volume>(<issue>17</issue>): <fpage>2547</fpage>–<lpage>2560</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Su</surname> <given-names>X</given-names></string-name>, <string-name><surname>Tsai</surname> <given-names>CL</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Nickerson</surname> <given-names>D</given-names></string-name>, <string-name><surname>Li</surname> <given-names>B</given-names></string-name> (<year>2009</year>). <article-title>Subgroup analysis via recursive partitioning</article-title>. <source><italic>Journal of Machine Learning Research</italic></source>, <volume>10</volume>: <fpage>141</fpage>–<lpage>158</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_015">
<mixed-citation publication-type="book"> <collab>R Core Team</collab> (<year>2021</year>). <source><italic>R: A Language and Environment for Statistical Computing</italic></source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_016">
<mixed-citation publication-type="other"> <string-name><surname>Thorp</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>L</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>J</given-names></string-name> (<year>2022</year>). <source><italic>CERFIT: Causal Effect Random Forest of Interaction Trees</italic></source>. <uri>https://cran.r-project.org/web/packages/CERFIT/index.html</uri>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Wager</surname> <given-names>S</given-names></string-name>, <string-name><surname>Athey</surname> <given-names>S</given-names></string-name> (<year>2018</year>). <article-title>Estimation and inference of heterogeneous treatment effects using random forests</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>113</volume>(<issue>523</issue>): <fpage>1228</fpage>–<lpage>1242</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1084_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Coffman</surname> <given-names>DL</given-names></string-name>, <string-name><surname>Ghosh</surname> <given-names>D</given-names></string-name> (<year>2015</year>). <article-title>A boosting algorithm for estimating generalized propensity scores with continuous treatments</article-title>. <source><italic>Journal of Causal Inference</italic></source>, <volume>3</volume>(<issue>1</issue>): <fpage>25</fpage>–<lpage>40</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
