README

This repository, "discrete_extremes", contains the R code and datasets used to prepare the article submitted to the Journal of Data Science in 2024.

The main files to run are:
- Simulated Data.R, which reproduces the simulation study in Section 3 of the article
- Word Frequency.R, Tornado.R and Multiple Birth.R, which replicate the Real Data Examples results of Section 4

They use the following files:
- British_word_freq.txt and French_word_freq.txt, which are the datasets used for the word frequency analysis
- outbreakdata1950_2015.txt, containing the data used for the tornado analysis
- Insee_AccouchMultiples_1995_2014.txt and USMultipleBirths_1995_2014.txt, the datasets used for the multiple birth data analysis
- Functions.R and utilities_5Feb2017.R, containing auxiliary R functions
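
As a minimal sketch of how one of the main scripts might be run, the R snippet below first checks that the data and helper files listed above are present and then sources the chosen script. It assumes that all files sit together in one directory (the repository root); the README does not state the layout. The names `required_files`, `missing_inputs` and `run_analysis` are hypothetical helpers introduced only for this illustration and are not part of the repository. Any R packages the scripts may need are not covered here.

```r
# Data and helper files named in this README.
# Assumption: they all sit in the same directory as the main scripts.
required_files <- c(
  "British_word_freq.txt",
  "French_word_freq.txt",
  "outbreakdata1950_2015.txt",
  "Insee_AccouchMultiples_1995_2014.txt",
  "USMultipleBirths_1995_2014.txt",
  "Functions.R",
  "utilities_5Feb2017.R"
)

# Hypothetical helper: names of the required files that are not found in `dir`.
missing_inputs <- function(dir = ".") {
  required_files[!file.exists(file.path(dir, required_files))]
}

# Hypothetical helper: source one main script after checking its inputs.
run_analysis <- function(script, dir = ".") {
  absent <- missing_inputs(dir)
  if (length(absent) > 0) {
    stop("Missing files: ", paste(absent, collapse = ", "))
  }
  # chdir = TRUE temporarily switches the working directory to the script's
  # directory, so relative paths inside the script resolve against `dir`.
  source(file.path(dir, script), chdir = TRUE)
}
```

With the working directory set to the repository root, a call such as run_analysis("Tornado.R") would then run the tornado analysis, and the other main scripts can be started the same way.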

