<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1100</article-id>
<article-id pub-id-type="doi">10.6339/23-JDS1100</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science in Action</subject></subj-group></article-categories>
<title-group>
<article-title>Quantifying Gender Disparity in Pre-Modern English Literature using Natural Language Processing</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-5988-8305</contrib-id>
<name><surname>Kejriwal</surname><given-names>Mayank</given-names></name><email xlink:href="mailto:kejriwal@isi.edu">kejriwal@isi.edu</email><xref ref-type="aff" rid="j_jds1100_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Nagaraj</surname><given-names>Akarsh</given-names></name><xref ref-type="aff" rid="j_jds1100_aff_001">1</xref>
</contrib>
<aff id="j_jds1100_aff_001"><label>1</label>Information Sciences Institute, Viterbi School of Engineering, <institution>University of Southern California</institution>, Marina del Rey, CA, <country>United States of America</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:kejriwal@isi.edu">kejriwal@isi.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2024</year></pub-date><pub-date pub-type="epub"><day>2</day><month>5</month><year>2023</year></pub-date><volume>22</volume><issue>1</issue><fpage>77</fpage><lpage>96</lpage><supplementary-material id="S1" content-type="document" xlink:href="jds1100_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>The supplementary material contains details on: data preprocessing, character extraction and gender classification; additional quantitative details, including complete statistical significance results, for Hypotheses 1 and 2; quantitative linear regression results (including supporting statistics such as the analysis of variance); methodological details and results for the secondary analysis noted in Section 1.1, wherein we use computational techniques from NLP to qualitatively assess the <italic>kinds</italic> of words associated with male and female character occurrences; and a detailed description of some limitations of the study that were briefly discussed in the main text. Code, data and workbooks for replicating the analyses in this paper are provided separately as supplementary material.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>26</day><month>5</month><year>2022</year></date><date date-type="accepted"><day>22</day><month>4</month><year>2023</year></date></history>
<permissions><copyright-statement>2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Research has continued to shed light on the extent and significance of gender disparity in social, cultural and economic spheres. More recently, computational tools from the data science and Natural Language Processing (NLP) communities have been proposed for measuring such disparity at scale using empirically rigorous methodologies. In this article, we contribute to this line of research by studying gender disparity in 2,443 copyright-expired literary texts published in the pre-modern period, defined in this work as the period ranging from the beginning of the nineteenth through the early twentieth century. Using a replicable data science methodology relying on publicly available and established NLP components, we extract three different gendered character prevalence measures within these texts. We use an extensive set of statistical tests to robustly demonstrate a significant disparity between the prevalence of female characters and male characters in pre-modern literature. We also show that the proportion of female characters is significantly higher in female-authored texts than in male-authored texts. However, regression-based analysis shows that, over the 120-year period covered by the corpus, female character prevalence does not change significantly over time, and remains below the parity level of 50%, regardless of the gender of the author. Qualitative analyses further show that descriptions associated with female characters across the corpus are stereotypical and markedly different from the descriptions associated with male characters.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>digital humanities</kwd>
<kwd>gender-specific character prevalence</kwd>
<kwd>named entity recognition</kwd>
<kwd>Project Gutenberg</kwd>
<kwd>word embedding</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1100_reflist_001">
<title>References</title>
<ref id="j_jds1100_ref_001">
<mixed-citation publication-type="book"> <string-name><surname>Adams</surname> <given-names>JE</given-names></string-name> (<year>2012</year>). <source><italic>A History of Victorian Literature</italic></source>, volume <volume>10</volume>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_002">
<mixed-citation publication-type="chapter"> <string-name><surname>Agarwal</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>J</given-names></string-name>, <string-name><surname>Kamath</surname> <given-names>S</given-names></string-name>, <string-name><surname>Balasubramanian</surname> <given-names>S</given-names></string-name>, <string-name><surname>Dey</surname> <given-names>SA</given-names></string-name> (<year>2015</year>). <chapter-title>Key female characters in film have more to talk about besides men: Automating the Bechdel test</chapter-title>. In: <source><italic>Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</italic></source>, <fpage>830</fpage>–<lpage>840</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Asghari</surname> <given-names>F</given-names></string-name> (<year>2016</year>). <article-title>Methodological considerations in gender studies</article-title>. <source><italic>Interdisciplinary Studies in the Humanities</italic></source>, <volume>7</volume>(<issue>4</issue>): <fpage>105</fpage>–<lpage>127</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_004">
<mixed-citation publication-type="other"> <string-name><surname>Belkhyr</surname> <given-names>S</given-names></string-name> (2013). Disney animation: Global diffusion and local appropriation of culture. <italic>Études Caribéennes</italic>, (22).</mixed-citation>
</ref>
<ref id="j_jds1100_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Bonferroni</surname> <given-names>C</given-names></string-name> (<year>1936</year>). <article-title>Teoria statistica delle classi e calcolo delle probabilità</article-title>. <source><italic>Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze</italic></source>, <volume>8</volume>: <fpage>3</fpage>–<lpage>62</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Budzise-Weaver</surname> <given-names>T</given-names></string-name> (<year>2016</year>). <article-title>Developing a qualitative coding analysis of visual artwork for humanities research</article-title>. <source><italic>DHQ: Digital Humanities Quarterly</italic></source>, <volume>10</volume>(<issue>4</issue>): <fpage>33</fpage>–<lpage>45</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_007">
<mixed-citation publication-type="book"> <string-name><surname>Burke</surname> <given-names>RJ</given-names></string-name>, <string-name><surname>Mattis</surname> <given-names>MC</given-names></string-name> (<year>2013</year>). <source><italic>Women on Corporate Boards of Directors: International Challenges and Opportunities</italic></source>, volume <volume>14</volume>. <publisher-name>Springer Science &amp; Business Media</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_008">
<mixed-citation publication-type="chapter"> <string-name><surname>Burley</surname> <given-names>T</given-names></string-name>, <string-name><surname>Humble</surname> <given-names>L</given-names></string-name>, <string-name><surname>Sleeper</surname> <given-names>C</given-names></string-name>, <string-name><surname>Sticha</surname> <given-names>A</given-names></string-name>, <string-name><surname>Chesler</surname> <given-names>A</given-names></string-name>, <string-name><surname>Regan</surname> <given-names>P</given-names></string-name>, <etal>et al.</etal> (<year>2020</year>). <chapter-title>NLP workflows for computational social science: Understanding triggers of state-led mass killings</chapter-title>. In: <source><italic>Practice and Experience in Advanced Research Computing</italic></source>, <fpage>152</fpage>–<lpage>159</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Cabrera</surname> <given-names>D</given-names></string-name>, <string-name><surname>Roy</surname> <given-names>D</given-names></string-name>, <string-name><surname>Chisolm</surname> <given-names>MS</given-names></string-name> (<year>2018</year>). <article-title>Social media scholarship and alternative metrics for academic promotion and tenure</article-title>. <source><italic>Journal of the American College of Radiology</italic></source>, <volume>15</volume>(<issue>1</issue>): <fpage>135</fpage>–<lpage>141</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.jacr.2017.09.012" xlink:type="simple">https://doi.org/10.1016/j.jacr.2017.09.012</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_010">
<mixed-citation publication-type="other"> <string-name><surname>Devlin</surname> <given-names>J</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>MW</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>K</given-names></string-name>, <string-name><surname>Toutanova</surname> <given-names>K</given-names></string-name> (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint: <uri>https://arxiv.org/abs/1810.04805</uri></mixed-citation>
</ref>
<ref id="j_jds1100_ref_011">
<mixed-citation publication-type="other"> <string-name><surname>Digital Humanities Lab, MIT</surname></string-name> (2022). The gender novels project. <uri>http://gendernovels.digitalhumanitiesmit.org/info/gender_novels_overview</uri>. Accessed: 2022-09-29.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Fine</surname> <given-names>L</given-names></string-name> (<year>1998</year>). <article-title>Gender conflicts and their “dark” projections in coming of age white female southern novels</article-title>. <source><italic>Southern Quarterly</italic></source>, <volume>36</volume>(<issue>4</issue>): <fpage>121</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_013">
<mixed-citation publication-type="other"> <string-name><surname>Glavaš</surname> <given-names>G</given-names></string-name>, <string-name><surname>Nanni</surname> <given-names>F</given-names></string-name>, <string-name><surname>Ponzetto</surname> <given-names>SP</given-names></string-name> (2017). Cross-lingual classification of topics in political texts. <italic>Association for Computational Linguistics (ACL)</italic>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Greider</surname> <given-names>CW</given-names></string-name>, <string-name><surname>Sheltzer</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Cantalupo</surname> <given-names>NC</given-names></string-name>, <string-name><surname>Copeland</surname> <given-names>WB</given-names></string-name>, <string-name><surname>Dasgupta</surname> <given-names>N</given-names></string-name>, <string-name><surname>Hopkins</surname> <given-names>N</given-names></string-name>, <etal>et al.</etal> (<year>2019</year>). <article-title>Increasing gender diversity in the STEM research workforce</article-title>. <source><italic>Science</italic></source>, <volume>366</volume>(<issue>6466</issue>): <fpage>692</fpage>–<lpage>695</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1126/science.aaz0649" xlink:type="simple">https://doi.org/10.1126/science.aaz0649</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Han</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name> (<year>2021</year>). <article-title>Transformer based network for open information extraction</article-title>. <source><italic>Engineering Applications of Artificial Intelligence</italic></source>, <volume>102</volume>: <fpage>104262</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.engappai.2021.104262" xlink:type="simple">https://doi.org/10.1016/j.engappai.2021.104262</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Hoekstra</surname> <given-names>V</given-names></string-name> (<year>2010</year>). <article-title>Increasing the gender diversity of high courts: A comparative view</article-title>. <source><italic>Politics &amp; Gender</italic></source>, <volume>6</volume>(<issue>3</issue>): <fpage>474</fpage>–<lpage>484</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1017/S1743923X10000243" xlink:type="simple">https://doi.org/10.1017/S1743923X10000243</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Homans</surname> <given-names>M</given-names></string-name> (<year>1993</year>). <article-title>Dinah’s blush, Maggie’s arm: Class, gender, and sexuality in George Eliot’s early novels</article-title>. <source><italic>Victorian Studies</italic></source>, <volume>36</volume>(<issue>2</issue>): <fpage>155</fpage>–<lpage>178</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_018">
<mixed-citation publication-type="chapter"> <string-name><surname>Hovy</surname> <given-names>D</given-names></string-name>, <string-name><surname>Volkova</surname> <given-names>S</given-names></string-name>, <string-name><surname>Bamman</surname> <given-names>D</given-names></string-name>, <string-name><surname>Jurgens</surname> <given-names>D</given-names></string-name>, <string-name><surname>O’Connor</surname> <given-names>B</given-names></string-name>, <string-name><surname>Tsur</surname> <given-names>O</given-names></string-name>, <etal>et al.</etal> (<year>2017</year>). <chapter-title>Proceedings of the second workshop on NLP and computational social science</chapter-title>. In: <source><italic>Proceedings of the Second Workshop on NLP and Computational Social Science</italic></source>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Kearney</surname> <given-names>MW</given-names></string-name> (<year>2021</year>). <article-title>Gendered tweets: Computational text analysis of gender differences in political discussion on Twitter</article-title>. <source><italic>Journal of Language and Social Psychology</italic></source>, <volume>40</volume>(<issue>4</issue>): <fpage>482</fpage>–<lpage>503</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/0261927X20969752" xlink:type="simple">https://doi.org/10.1177/0261927X20969752</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Kejriwal</surname> <given-names>M</given-names></string-name> (<year>2022</year>). <article-title>Measuring spatio-textual affinities in Twitter between two urban metropolises</article-title>. <source><italic>Journal of Computational Social Science</italic></source>, <volume>5</volume>(<issue>1</issue>): <fpage>227</fpage>–<lpage>252</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Jarynowski</surname> <given-names>A</given-names></string-name>, <string-name><surname>Paradowski</surname> <given-names>MB</given-names></string-name>, <string-name><surname>Buda</surname> <given-names>A</given-names></string-name> (<year>2019</year>). <article-title>Modelling communities and populations: An introduction to computational social science</article-title>. <source><italic>Studia Metodologiczne</italic></source>, <volume>39</volume>: <fpage>123</fpage>–<lpage>152</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_022">
<mixed-citation publication-type="book"> <string-name><surname>Jockers</surname> <given-names>ML</given-names></string-name> (<year>2013</year>). <source><italic>Macroanalysis: Digital Methods and Literary History</italic></source>. <publisher-name>University of Illinois Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_023">
<mixed-citation publication-type="book"> <string-name><surname>John</surname> <given-names>J</given-names></string-name> (<year>2016</year>). <source><italic>The Oxford Handbook of Victorian Literary Culture</italic></source>. <publisher-name>Oxford University Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Jordan</surname> <given-names>CE</given-names></string-name>, <string-name><surname>Clark</surname> <given-names>SJ</given-names></string-name>, <string-name><surname>Waldron</surname> <given-names>MA</given-names></string-name> (<year>2007</year>). <article-title>Gender bias and compensation in the executive suite of the Fortune 100</article-title>. <source><italic>Journal of Organizational Culture, Communications and Conflict</italic></source>, <volume>11</volume>(<issue>1</issue>): <fpage>19</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Katz</surname> <given-names>E</given-names></string-name> (<year>1999</year>). <article-title>Theorizing diffusion: Tarde and Sorokin revisited</article-title>. <source><italic>The Annals of the American Academy of Political and Social Science</italic></source>, <volume>566</volume>(<issue>1</issue>): <fpage>144</fpage>–<lpage>155</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/000271629956600112" xlink:type="simple">https://doi.org/10.1177/000271629956600112</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Keuschnigg</surname> <given-names>M</given-names></string-name>, <string-name><surname>Lovsjö</surname> <given-names>N</given-names></string-name>, <string-name><surname>Hedström</surname> <given-names>P</given-names></string-name> (<year>2018</year>). <article-title>Analytical sociology and computational social science</article-title>. <source><italic>Journal of Computational Social Science</italic></source>, <volume>1</volume>(<issue>1</issue>): <fpage>3</fpage>–<lpage>14</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s42001-017-0006-5" xlink:type="simple">https://doi.org/10.1007/s42001-017-0006-5</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_027">
<mixed-citation publication-type="other"> <string-name><surname>Lebert</surname> <given-names>M</given-names></string-name> (2009). A short history of ebooks. <uri>http://www.gutenberg.org/files/29801/29801-0.txt</uri>. Accessed: 2023-03-14.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_028">
<mixed-citation publication-type="book"> <string-name><surname>Legal Information Institute, Cornell Law School</surname></string-name> (<year>2020</year>). <source><italic>Gender Bias</italic></source>. <uri>https://www.law.cornell.edu/wex/gender_bias</uri>. Accessed: 2022-09-29.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_029">
<mixed-citation publication-type="other"> <string-name><surname>Liu</surname> <given-names>Y</given-names></string-name> (2019). Fine-tune BERT for extractive summarization. arXiv preprint: <uri>https://arxiv.org/abs/1903.10318</uri></mixed-citation>
</ref>
<ref id="j_jds1100_ref_030">
<mixed-citation publication-type="other"> <string-name><surname>Mason</surname> <given-names>W</given-names></string-name>, <string-name><surname>Vaughan</surname> <given-names>JW</given-names></string-name>, <string-name><surname>Wallach</surname> <given-names>H</given-names></string-name> (2014). Computational social science and social computing.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_031">
<mixed-citation publication-type="book"> <string-name><surname>Alcott</surname> <given-names>LM</given-names></string-name> (<year>1868</year>). <source><italic>Little Women</italic></source>. <publisher-name>Project Gutenberg</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_032">
<mixed-citation publication-type="journal"> <string-name><surname>Miller</surname> <given-names>DL</given-names></string-name> (<year>2016</year>). <article-title>Gender and the artist archetype: Understanding gender inequality in artistic careers</article-title>. <source><italic>Sociology Compass</italic></source>, <volume>10</volume>(<issue>2</issue>): <fpage>119</fpage>–<lpage>131</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/soc4.12350" xlink:type="simple">https://doi.org/10.1111/soc4.12350</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_033">
<mixed-citation publication-type="chapter"> <string-name><surname>Milli</surname> <given-names>S</given-names></string-name>, <string-name><surname>Bamman</surname> <given-names>D</given-names></string-name> (<year>2016</year>). <chapter-title>Beyond canonical texts: A computational analysis of fanfiction</chapter-title>. In: <source><italic>Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing</italic></source>, <fpage>2048</fpage>–<lpage>2053</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_034">
<mixed-citation publication-type="journal"> <string-name><surname>Montasseri</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Khaghaninejad</surname> <given-names>MS</given-names></string-name>, <string-name><surname>Moloodi</surname> <given-names>A</given-names></string-name> (<year>2020</year>). <article-title>Gender representation in American movies: A corpus-based analysis</article-title>. <source><italic>The International Journal of Humanities</italic></source>, <volume>27</volume>(<issue>4</issue>): <fpage>42</fpage>–<lpage>53</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_035">
<mixed-citation publication-type="chapter"> <string-name><surname>Montjoye</surname> <given-names>YAd</given-names></string-name>, <string-name><surname>Quoidbach</surname> <given-names>J</given-names></string-name>, <string-name><surname>Robic</surname> <given-names>F</given-names></string-name>, <string-name><surname>Pentland</surname> <given-names>AS</given-names></string-name> (<year>2013</year>). <chapter-title>Predicting personality using novel mobile phone-based metrics</chapter-title>. In: <source><italic>International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction</italic></source>, <fpage>48</fpage>–<lpage>55</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_036">
<mixed-citation publication-type="journal"> <string-name><surname>Nadeau</surname> <given-names>D</given-names></string-name>, <string-name><surname>Sekine</surname> <given-names>S</given-names></string-name> (<year>2007</year>). <article-title>A survey of named entity recognition and classification</article-title>. <source><italic>Lingvisticae Investigationes</italic></source>, <volume>30</volume>(<issue>1</issue>): <fpage>3</fpage>–<lpage>26</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1075/li.30.1.03nad" xlink:type="simple">https://doi.org/10.1075/li.30.1.03nad</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_037">
<mixed-citation publication-type="journal"> <string-name><surname>Nagaraj</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kejriwal</surname> <given-names>M</given-names></string-name> (<year>2022</year>). <article-title>Dataset for studying gender disparity in English literary texts</article-title>. <source><italic>Data in Brief</italic></source>, <volume>41</volume>: <comment>107905</comment>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.dib.2022.107905" xlink:type="simple">https://doi.org/10.1016/j.dib.2022.107905</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_038">
<mixed-citation publication-type="journal"> <string-name><surname>Napierala</surname> <given-names>MA</given-names></string-name> (<year>2012</year>). <article-title>What is the Bonferroni correction?</article-title> <source><italic>AAOS Now</italic></source>, <fpage>40</fpage>–<lpage>41</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_039">
<mixed-citation publication-type="journal"> <string-name><surname>Naseem</surname> <given-names>U</given-names></string-name>, <string-name><surname>Razzak</surname> <given-names>I</given-names></string-name>, <string-name><surname>Musial</surname> <given-names>K</given-names></string-name>, <string-name><surname>Imran</surname> <given-names>M</given-names></string-name> (<year>2020</year>). <article-title>Transformer based deep intelligent contextual embedding for Twitter sentiment analysis</article-title>. <source><italic>Future Generation Computer Systems</italic></source>, <volume>113</volume>: <fpage>58</fpage>–<lpage>69</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.future.2020.06.050" xlink:type="simple">https://doi.org/10.1016/j.future.2020.06.050</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_040">
<mixed-citation publication-type="journal"> <string-name><surname>Nath</surname> <given-names>R</given-names></string-name>, <string-name><surname>Murthy</surname> <given-names>N</given-names></string-name> (<year>2004</year>). <article-title>A study of the relationship between internet diffusion and culture</article-title>. <source><italic>Journal of International Information Management</italic></source>, <volume>13</volume>(<issue>2</issue>): <fpage>5</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_041">
<mixed-citation publication-type="journal"> <string-name><surname>Nielsen</surname> <given-names>MW</given-names></string-name>, <string-name><surname>Bloch</surname> <given-names>CW</given-names></string-name>, <string-name><surname>Schiebinger</surname> <given-names>L</given-names></string-name> (<year>2018</year>). <article-title>Making gender diversity work for scientific discovery and innovation</article-title>. <source><italic>Nature Human Behaviour</italic></source>, <volume>2</volume>(<issue>10</issue>): <fpage>726</fpage>–<lpage>734</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41562-018-0433-1" xlink:type="simple">https://doi.org/10.1038/s41562-018-0433-1</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_042">
<mixed-citation publication-type="chapter"> <string-name><surname>Nixon</surname> <given-names>L</given-names></string-name> (<year>1994</year>). <chapter-title>Gender bias in archaeology</chapter-title>. In: <source><italic>Women in Ancient Societies</italic></source>, <fpage>1</fpage>–<lpage>23</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_043">
<mixed-citation publication-type="journal"> <string-name><surname>O’Connor</surname> <given-names>SD</given-names></string-name> (<year>1996</year>). <article-title>History of the women’s suffrage movement</article-title>. <source><italic>Vanderbilt Law Review</italic></source>, <volume>49</volume>: <fpage>657</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_044">
<mixed-citation publication-type="journal"> <string-name><surname>Oh</surname> <given-names>D</given-names></string-name>, <string-name><surname>Dotsch</surname> <given-names>R</given-names></string-name>, <string-name><surname>Porter</surname> <given-names>J</given-names></string-name>, <string-name><surname>Todorov</surname> <given-names>A</given-names></string-name> (<year>2020</year>). <article-title>Gender biases in impressions from faces: Empirical studies and computational models</article-title>. <source><italic>Journal of Experimental Psychology: General</italic></source>, <volume>149</volume>(<issue>2</issue>): <fpage>323</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1037/xge0000638" xlink:type="simple">https://doi.org/10.1037/xge0000638</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_045">
<mixed-citation publication-type="journal"> <string-name><surname>Peters</surname> <given-names>K</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Kaplan</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Ognibeni</surname> <given-names>B</given-names></string-name>, <string-name><surname>Pauwels</surname> <given-names>K</given-names></string-name> (<year>2013</year>). <article-title>Social media metrics–a framework and guidelines for managing social media</article-title>. <source><italic>Journal of Interactive Marketing</italic></source>, <volume>27</volume>(<issue>4</issue>): <fpage>281</fpage>–<lpage>298</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.intmar.2013.09.007" xlink:type="simple">https://doi.org/10.1016/j.intmar.2013.09.007</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_046">
<mixed-citation publication-type="journal"> <string-name><surname>Phillips</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Malone</surname> <given-names>B</given-names></string-name> (<year>2014</year>). <article-title>Increasing racial/ethnic diversity in nursing to reduce health disparities and achieve health equity</article-title>. <source><italic>Public Health Reports</italic></source>, <volume>129</volume>(<issue>1_suppl2</issue>): <fpage>45</fpage>–<lpage>50</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/00333549141291S209" xlink:type="simple">https://doi.org/10.1177/00333549141291S209</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_047">
<mixed-citation publication-type="book"> <string-name><surname>Pilcher</surname> <given-names>J</given-names></string-name>, <string-name><surname>Whelehan</surname> <given-names>I</given-names></string-name> (<year>2016</year>). <source><italic>Key Concepts in Gender Studies</italic></source>. <publisher-name>Sage</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_048">
<mixed-citation publication-type="chapter"> <string-name><surname>Prabhumoye</surname> <given-names>S</given-names></string-name>, <string-name><surname>Choudhary</surname> <given-names>S</given-names></string-name>, <string-name><surname>Spiliopoulou</surname> <given-names>E</given-names></string-name>, <string-name><surname>Bogart</surname> <given-names>C</given-names></string-name>, <string-name><surname>Rose</surname> <given-names>C</given-names></string-name>, <string-name><surname>Black</surname> <given-names>AW</given-names></string-name> (<year>2017</year>). <chapter-title>Linguistic markers of influence in informal interactions</chapter-title>. In: <source><italic>Proceedings of the Second Workshop on NLP and Computational Social Science</italic></source>, <fpage>53</fpage>–<lpage>62</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_049">
<mixed-citation publication-type="other"> <string-name><surname>Project Gutenberg</surname></string-name> (<year>1971</year>). Project Gutenberg. <uri>https://www.gutenberg.org/</uri>. Accessed: 2022-09-29.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_050">
<mixed-citation publication-type="journal"> <string-name><surname>Reagle</surname> <given-names>J</given-names></string-name>, <string-name><surname>Rhue</surname> <given-names>L</given-names></string-name> (<year>2011</year>). <article-title>Gender bias in Wikipedia and Britannica</article-title>. <source><italic>International Journal of Communication</italic></source>, <volume>5</volume>: <fpage>21</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_051">
<mixed-citation publication-type="journal"> <string-name><surname>Reddy</surname> <given-names>S</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>D</given-names></string-name>, <string-name><surname>Manning</surname> <given-names>CD</given-names></string-name> (<year>2019</year>). <article-title>CoQA: A conversational question answering challenge</article-title>. <source><italic>Transactions of the Association for Computational Linguistics</italic></source>, <volume>7</volume>: <fpage>249</fpage>–<lpage>266</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1162/tacl_a_00266" xlink:type="simple">https://doi.org/10.1162/tacl_a_00266</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_052">
<mixed-citation publication-type="journal"> <string-name><surname>Richard</surname> <given-names>OC</given-names></string-name> (<year>2000</year>). <article-title>Racial diversity, business strategy, and firm performance: A resource-based view</article-title>. <source><italic>Academy of Management Journal</italic></source>, <volume>43</volume>(<issue>2</issue>): <fpage>164</fpage>–<lpage>177</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.2307/1556374" xlink:type="simple">https://doi.org/10.2307/1556374</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_053">
<mixed-citation publication-type="book"> <string-name><surname>Rochon</surname> <given-names>TR</given-names></string-name> (<year>2000</year>). <source><italic>Culture Moves: Ideas, Activism, and Changing Values</italic></source>. <publisher-name>Princeton University Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_054">
<mixed-citation publication-type="journal"> <string-name><surname>Rodriguez</surname> <given-names>MY</given-names></string-name>, <string-name><surname>Storer</surname> <given-names>H</given-names></string-name> (<year>2020</year>). <article-title>A computational social science perspective on qualitative data exploration: Using topic models for the descriptive analysis of social media data</article-title>. <source><italic>Journal of Technology in Human Services</italic></source>, <volume>38</volume>(<issue>1</issue>): <fpage>54</fpage>–<lpage>86</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/15228835.2019.1616350" xlink:type="simple">https://doi.org/10.1080/15228835.2019.1616350</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_055">
<mixed-citation publication-type="book"> <string-name><surname>Rose</surname> <given-names>A</given-names></string-name> (<year>2009</year>). <source><italic>Gender and Victorian Reform</italic></source>. <publisher-name>Cambridge Scholars Publishing</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_056">
<mixed-citation publication-type="journal"> <string-name><surname>Rosenmann</surname> <given-names>A</given-names></string-name> (<year>2016</year>). <article-title>Alignment with globalized Western culture: Between inclusionary values and an exclusionary social identity</article-title>. <source><italic>European Journal of Social Psychology</italic></source>, <volume>46</volume>(<issue>1</issue>): <fpage>26</fpage>–<lpage>43</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/ejsp.2130" xlink:type="simple">https://doi.org/10.1002/ejsp.2130</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_057">
<mixed-citation publication-type="journal"> <string-name><surname>Setzler</surname> <given-names>M</given-names></string-name> (<year>2019</year>). <article-title>Measuring bias against female political leadership</article-title>. <source><italic>Politics &amp; Gender</italic></source>, <volume>15</volume>(<issue>4</issue>): <fpage>695</fpage>–<lpage>721</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1017/S1743923X18000430" xlink:type="simple">https://doi.org/10.1017/S1743923X18000430</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_058">
<mixed-citation publication-type="other"> <string-name><surname>Siblini</surname> <given-names>W</given-names></string-name>, <string-name><surname>Pasqual</surname> <given-names>C</given-names></string-name>, <string-name><surname>Lavielle</surname> <given-names>A</given-names></string-name>, <string-name><surname>Cauchois</surname> <given-names>C</given-names></string-name> (<year>2019</year>). Multilingual question answering from formatted text applied to conversational agents. arXiv preprint: <uri>https://arxiv.org/abs/1910.04659</uri></mixed-citation>
</ref>
<ref id="j_jds1100_ref_059">
<mixed-citation publication-type="other"> <string-name><surname>Stathoulopoulos</surname> <given-names>K</given-names></string-name>, <string-name><surname>Mateos-Garcia</surname> <given-names>JC</given-names></string-name> (<year>2019</year>). Gender diversity in AI research. <uri>https://media.nesta.org.uk/documents/Gender_Diversity_in_AI_Research.pdf</uri>. Available at SSRN 3428240.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_060">
<mixed-citation publication-type="book"> <string-name><surname>Stevenson</surname> <given-names>RL</given-names></string-name> (<year>1883</year>). <source><italic>Treasure Island</italic></source>. <publisher-name>Cassell &amp; Co.</publisher-name></mixed-citation>
</ref>
<ref id="j_jds1100_ref_061">
<mixed-citation publication-type="journal"> <string-name><surname>Tusan</surname> <given-names>ME</given-names></string-name> (<year>2004</year>). <article-title>Performing work: Gender, class, and the printing trade in Victorian Britain</article-title>. <source><italic>Journal of Women’s History</italic></source>, <volume>16</volume>(<issue>1</issue>): <fpage>103</fpage>–<lpage>126</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1353/jowh.2004.0037" xlink:type="simple">https://doi.org/10.1353/jowh.2004.0037</ext-link></mixed-citation>
</ref>
<ref id="j_jds1100_ref_062">
<mixed-citation publication-type="chapter"> <string-name><surname>Wolf</surname> <given-names>T</given-names></string-name>, <string-name><surname>Debut</surname> <given-names>L</given-names></string-name>, <string-name><surname>Sanh</surname> <given-names>V</given-names></string-name>, <string-name><surname>Chaumond</surname> <given-names>J</given-names></string-name>, <string-name><surname>Delangue</surname> <given-names>C</given-names></string-name>, <string-name><surname>Moi</surname> <given-names>A</given-names></string-name>, <etal>et al.</etal> (<year>2020</year>). <chapter-title>Transformers: State-of-the-art natural language processing</chapter-title>. In: <source><italic>Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</italic></source>, <fpage>38</fpage>–<lpage>45</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_063">
<mixed-citation publication-type="chapter"> <string-name><surname>Wood-Doughty</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Smith</surname> <given-names>M</given-names></string-name>, <string-name><surname>Broniatowski</surname> <given-names>D</given-names></string-name>, <string-name><surname>Dredze</surname> <given-names>M</given-names></string-name> (<year>2017</year>). <chapter-title>How does Twitter user behavior vary across demographic groups?</chapter-title> In: <source><italic>Proceedings of the Second Workshop on NLP and Computational Social Science</italic></source>, <fpage>83</fpage>–<lpage>89</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1100_ref_064">
<mixed-citation publication-type="journal"> <string-name><surname>Yang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>J</given-names></string-name> (<year>2020</year>). <article-title>Measuring female representation and impact in films over time</article-title>. <source><italic>ACM Transactions on Data Science</italic></source>, <volume>1</volume>(<issue>4</issue>): <fpage>1</fpage>–<lpage>14</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3411213" xlink:type="simple">https://doi.org/10.1145/3411213</ext-link></mixed-citation>
</ref>
</ref-list>
</back>
</article>
