Journal of Data Science

Data Science Applications and Implications in Legal Studies: A Perspective Through Topic Modelling
Volume 21, Issue 1 (2023), pp. 57–67
Jinzhe Tan, Huan Wan, Ping Yan, et al. (4 authors)

https://doi.org/10.6339/22-JDS1058
Pub. online: 4 August 2022      Type: Data Science In Action      Open Access

Received: 29 December 2021
Accepted: 2 July 2022
Published: 4 August 2022

Abstract

Law and legal studies have become an exciting new field for data science applications, while technological advancement in turn has profound implications for legal practice. On the one hand, the legal industry has accumulated a rich body of high-quality texts, images and other digitised material that is ready to be processed and analysed by data scientists. On the other hand, the increasing popularity of data science poses a genuine challenge to legal practitioners, regulators and even the general public, and has motivated a long-lasting academic debate on issues such as privacy protection and algorithmic discrimination. This paper collects 1236 journal articles involving both law and data science from the Web of Science platform to understand the patterns and trends of this interdisciplinary research field as reflected in English-language journal publications. We find a clear trend of increasing publication volume over time and a strong presence of high-impact law and political science journals. We then use Latent Dirichlet Allocation (LDA) as a topic modelling method to classify the abstracts into four topics, with the number of topics chosen according to a coherence measure. The four topics identified confirm that both challenges and opportunities have been investigated in this interdisciplinary field, and they help suggest directions for future research.
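
The model-selection step described in the abstract (fit LDA for several candidate topic counts, then keep the count with the best coherence) can be sketched as follows. This is a minimal illustration, not the authors' notebook code: it uses a small collapsed Gibbs sampler written from scratch and the UMass coherence score as a simple stand-in, since the abstract does not say which coherence measure or library was used. The toy corpus, the function names `fit_lda` and `umass_coherence`, and all hyperparameter values are assumptions made for the example.

```python
import math
import random
from collections import Counter


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the conditional.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def umass_coherence(topic_word, docs, top_n=5):
    """Average UMass coherence over topics, using each topic's top words."""
    doc_sets = [set(doc) for doc in docs]
    scores = []
    for counts in topic_word:
        top = [w for w, c in counts.most_common(top_n) if c > 0]
        score = 0.0
        for m in range(1, len(top)):
            for l in range(m):
                co = sum(1 for s in doc_sets if top[m] in s and top[l] in s)
                df = sum(1 for s in doc_sets if top[l] in s)
                score += math.log((co + 1) / df)
        scores.append(score)
    return sum(scores) / len(scores)


# Hypothetical, already-tokenised abstracts (not the paper's data).
docs = [
    "privacy data protection regulation privacy".split(),
    "privacy regulation consent data protection".split(),
    "algorithm discrimination bias fairness algorithm".split(),
    "bias fairness algorithm accountability discrimination".split(),
    "court decision prediction judge court".split(),
    "judge prediction court decision model".split(),
    "copyright patent software copyright law".split(),
    "patent software law copyright protection".split(),
]

candidates = [2, 3, 4]
scores = {k: umass_coherence(fit_lda(docs, k), docs) for k in candidates}
best_k = max(scores, key=scores.get)  # higher (closer to zero) is better
print(scores, best_k)
```

On a real corpus the abstracts would first need tokenisation and stop-word removal, and the candidate range would be wider; the selection logic stays the same.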

Supplementary material

The file “JDS_dataScienceLaw.ipynb” has the Python code used for the analysis above. The file “articles_en.csv” has the original data collected from Web of Science. The file “README.txt” has the description of the two files above.



Copyright
2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
artificial intelligence, law, literature review, text mining

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
