<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1144</article-id>
<article-id pub-id-type="doi">10.6339/24-JDS1144</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science in Action</subject></subj-group></article-categories>
<title-group>
<article-title>Traditional and GenAI Text Analysis of COVID-19 Pandemic Trends in Hospital Community Benefits IRS Documentation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Hadley</surname><given-names>Emily</given-names></name><email xlink:href="mailto:ehadley@rti.org">ehadley@rti.org</email><xref ref-type="aff" rid="j_jds1144_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Marcial</surname><given-names>Laura</given-names></name><xref ref-type="aff" rid="j_jds1144_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Quattrone</surname><given-names>Wes</given-names></name><xref ref-type="aff" rid="j_jds1144_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Bobashev</surname><given-names>Georgiy</given-names></name><xref ref-type="aff" rid="j_jds1144_aff_001">1</xref>
</contrib>
<aff id="j_jds1144_aff_001"><label>1</label><institution>RTI International</institution>, Durham, NC, <country>U.S.A.</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:ehadley@rti.org">ehadley@rti.org</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2024</year></pub-date><pub-date pub-type="epub"><day>23</day><month>7</month><year>2024</year></pub-date><volume>22</volume><issue>3</issue><fpage>393</fpage><lpage>408</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1144_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The zipped supplementary material file includes code and output for this analysis.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>1</day><month>12</month><year>2023</year></date><date date-type="accepted"><day>14</day><month>6</month><year>2024</year></date></history>
<permissions><copyright-statement>2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The coronavirus disease 2019 (COVID-19) pandemic presented unique challenges to the U.S. healthcare system, particularly for nonprofit U.S. hospitals that are obligated to provide community benefits in exchange for federal tax exemptions. We sought to examine how hospitals initiated, modified, or disbanded community benefits programming in response to the COVID-19 pandemic. We used the free-response text in Part IV of Internal Revenue Service (IRS) Form 990 Schedule H (F990H) to assess health equity and disparities. We combined traditional key term frequency and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) clustering approaches with a novel Generative Pre-trained Transformer (GPT) 3.5 summarization approach. Our research reveals shifts in community benefits programming. We observed an increase in COVID-related terms starting in the 2019 tax year, indicating a pivot in community focus and efforts toward pandemic-related activities such as telehealth services and COVID-19 testing and prevention. The clustering analysis identified themes related to COVID-19 and community benefits. Generative Artificial Intelligence (GenAI) summarization with GPT3.5 contextualized these changes, revealing examples of healthcare system adaptations and program cancellations. However, GPT3.5 also encountered some accuracy and validation challenges. This multifaceted text analysis underscores the adaptability of hospitals in maintaining community health support during crises and suggests the potential of advanced AI tools in evaluating large-scale qualitative data for policy and public health research.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>generative artificial intelligence</kwd>
<kwd>hospital administration</kwd>
<kwd>natural language processing</kwd>
<kwd>text mining</kwd>
</kwd-group>
<funding-group><funding-statement>Funding for this work was provided by the Robert Wood Johnson Foundation under grants 77387 and 80508.</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1144_reflist_001">
<title>References</title>
<ref id="j_jds1144_ref_001">
<mixed-citation publication-type="journal"> <string-name><surname>Alomari</surname> <given-names>A</given-names></string-name>, <string-name><surname>Idris</surname> <given-names>N</given-names></string-name>, <string-name><surname>Sabri</surname> <given-names>AQM</given-names></string-name>, <string-name><surname>Alsmadi</surname> <given-names>I</given-names></string-name> (<year>2022</year>). <article-title>Deep reinforcement and transfer learning for abstractive text summarization: A review</article-title>. <source><italic>Computer Speech &amp; Language</italic></source>, <volume>71</volume>: <fpage>101276</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.csl.2021.101276" xlink:type="simple">https://doi.org/10.1016/j.csl.2021.101276</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Azam</surname> <given-names>N</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>J</given-names></string-name> (<year>2012</year>). <article-title>Comparison of term frequency and document frequency based feature selection metrics in text categorization</article-title>. <source><italic>Expert Systems with Applications</italic></source>, <volume>39</volume>(<issue>5</issue>): <fpage>4760</fpage>–<lpage>4768</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.eswa.2011.09.160" xlink:type="simple">https://doi.org/10.1016/j.eswa.2011.09.160</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Hadley</surname> <given-names>E</given-names></string-name>, <string-name><surname>Marcial</surname> <given-names>LH</given-names></string-name>, <string-name><surname>Quattrone</surname> <given-names>W</given-names></string-name>, <string-name><surname>Bobashev</surname> <given-names>G</given-names></string-name> (<year>2023</year>). <article-title>Text analysis of trends in health equity and disparities from the Internal Revenue Service tax documentation submitted by US nonprofit hospitals between 2010 and 2019: Exploratory study</article-title>. <source><italic>Journal of Medical Internet Research</italic></source>, <volume>25</volume>(<issue>1</issue>): <fpage>e44330</fpage>. <comment>Publisher: JMIR Publications Inc., Toronto, Canada.</comment> <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.2196/44330" xlink:type="simple">https://doi.org/10.2196/44330</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_004">
<mixed-citation publication-type="other"> HDBSCAN (<year>2023</year>). How HDBSCAN Works — hdbscan 0.8.1 documentation.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_005">
<mixed-citation publication-type="other"> <string-name><surname>Hearle</surname> <given-names>K</given-names></string-name> (<year>2020</year>). Coronavirus Pandemic and Community Benefit Reporting, <italic>Technical report</italic>, Verité Healthcare Consulting.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_006">
<mixed-citation publication-type="other"> <string-name><surname>House</surname> <given-names>TW</given-names></string-name> (<year>2023</year>). <italic>Community Benefit.</italic></mixed-citation>
</ref>
<ref id="j_jds1144_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Nelson</surname> <given-names>LK</given-names></string-name> (<year>2020</year>). <article-title>Computational grounded theory: A methodological framework</article-title>. <source><italic>Sociological Methods &amp; Research</italic></source>, <volume>49</volume>(<issue>1</issue>): <fpage>3</fpage>–<lpage>42</lpage>. <comment>Publisher: SAGE Publications Inc.</comment></mixed-citation>
</ref>
<ref id="j_jds1144_ref_008">
<mixed-citation publication-type="book"> <string-name><surname>Ortiz</surname> <given-names>A</given-names></string-name>, <string-name><surname>Quattrone</surname> <given-names>W</given-names></string-name>, <string-name><surname>Underwood</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zmuda</surname> <given-names>M</given-names></string-name>, <string-name><surname>Goode</surname> <given-names>LSA</given-names></string-name>, <string-name><surname>Saur</surname> <given-names>C</given-names></string-name>, <etal>et al.</etal> (<year>2022</year>). <source><italic>The Development and Management of Community Benefit Insight: A Web-Based Resource That Aggregates US-Based Nonprofit Hospital Community Benefit Spending Data</italic></source>. <publisher-name>RTI Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_009">
<mixed-citation publication-type="other"> <string-name><surname>Atkeson</surname> <given-names>A</given-names></string-name>, <string-name><surname>Rosenthal</surname> <given-names>J</given-names></string-name> (<year>2020</year>). States Explore Pivoting Hospital Community Benefit Requirements to Address Disparities Exposed by COVID-19.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name>, <string-name><surname>Singh</surname> <given-names>SR</given-names></string-name>, <string-name><surname>Jacobson</surname> <given-names>PD</given-names></string-name> (<year>2013</year>). <article-title>Evaluating hospitals’ provision of community benefit: An argument for an outcome-based approach to nonprofit hospital tax exemption</article-title>. <source><italic>American Journal of Public Health</italic></source>, <volume>103</volume>(<issue>4</issue>): <fpage>612</fpage>–<lpage>616</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.2105/AJPH.2012.301048" xlink:type="simple">https://doi.org/10.2105/AJPH.2012.301048</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Saghafian</surname> <given-names>S</given-names></string-name>, <string-name><surname>Song</surname> <given-names>LD</given-names></string-name>, <string-name><surname>Raja</surname> <given-names>AS</given-names></string-name> (<year>2022</year>). <article-title>Towards a more efficient healthcare system: Opportunities and challenges caused by hospital closures amid the COVID-19 pandemic</article-title>. <source><italic>Health Care Management Science</italic></source>, <volume>25</volume>(<issue>2</issue>): <fpage>187</fpage>–<lpage>190</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s10729-022-09591-7" xlink:type="simple">https://doi.org/10.1007/s10729-022-09591-7</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_012">
<mixed-citation publication-type="other"> scikit-learn (<year>2023</year>). sklearn.feature_extraction.text.TfidfVectorizer.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_013">
<mixed-citation publication-type="other"> Internal Revenue Service (<year>2023</year>a). About Schedule H (Form 990), Hospitals. Internal Revenue Service.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_014">
<mixed-citation publication-type="other"> Internal Revenue Service (<year>2023</year>b). Charitable Hospitals - General Requirements for Tax-Exemption Under Section 501(c)(3). Internal Revenue Service.</mixed-citation>
</ref>
<ref id="j_jds1144_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Williams</surname> <given-names>D</given-names></string-name>, <string-name><surname>Reiter</surname> <given-names>KL</given-names></string-name>, <string-name><surname>Pink</surname> <given-names>GH</given-names></string-name>, <string-name><surname>Holmes</surname> <given-names>GM</given-names></string-name>, <string-name><surname>Song</surname> <given-names>PH</given-names></string-name> (<year>2020</year>). <article-title>Rural hospital mergers increased between 2005 and 2016—what did those hospitals look like?</article-title> <source><italic>INQUIRY: The Journal of Health Care Organization, Provision, and Financing</italic></source>, <volume>57</volume>: <fpage>0046958020935666</fpage>. <comment>Publisher: SAGE Publications Inc.</comment></mixed-citation>
</ref>
<ref id="j_jds1144_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Young</surname> <given-names>GJ</given-names></string-name>, <string-name><surname>Chou</surname> <given-names>CH</given-names></string-name>, <string-name><surname>Alexander</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>SYD</given-names></string-name>, <string-name><surname>Raver</surname> <given-names>E</given-names></string-name> (<year>2013</year>). <article-title>Provision of community benefits by tax-exempt U.S. hospitals</article-title>. <source><italic>The New England Journal of Medicine</italic></source>, <volume>368</volume>(<issue>16</issue>): <fpage>1519</fpage>–<lpage>1527</lpage>. <comment>Publisher: Massachusetts Medical Society</comment>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1056/NEJMsa1210239" xlink:type="simple">https://doi.org/10.1056/NEJMsa1210239</ext-link></mixed-citation>
</ref>
<ref id="j_jds1144_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Zare</surname> <given-names>H</given-names></string-name>, <string-name><surname>Eisenberg</surname> <given-names>M</given-names></string-name>, <string-name><surname>Anderson</surname> <given-names>G</given-names></string-name> (<year>2021</year>). <article-title>Charity care and community benefit in non-profit hospitals: Definition and requirements</article-title>. <source><italic>INQUIRY: The Journal of Health Care Organization, Provision, and Financing</italic></source>, <volume>58</volume>: <fpage>00469580211028180</fpage>. <comment>Publisher: SAGE Publications Inc.</comment></mixed-citation>
</ref>
</ref-list>
</back>
</article>
