
Journal of Data Science


EVIboost for the Estimation of Extreme Value Index Under Heterogeneous Extremes
Volume 21, Issue 4 (2023), pp. 638–657
Jiaxi Wang, Yanxi Hou, Xingchi Li, et al. (4 authors)

https://doi.org/10.6339/22-JDS1067
Pub. online: 3 October 2022      Type: Statistical Data Science      Open Access

Received: 18 June 2022
Accepted: 16 September 2022
Published: 3 October 2022

Abstract

Modeling heterogeneity of heavy-tailed distributions under a regression framework is challenging, so classical statistical methodologies usually place conditions on the distribution models to facilitate the learning procedure. However, these conditions are likely to overlook the complex dependence structure between the heaviness of tails and the covariates. Moreover, data sparsity in tail regions makes the inference method less stable, leading to biased estimates of extreme-related quantities. This paper proposes a gradient boosting algorithm to estimate a functional extreme value index with heterogeneous extremes. The proposed algorithm is a data-driven procedure that captures complex and dynamic structures in tail distributions. We also conduct extensive simulation studies to demonstrate the prediction accuracy of the proposed algorithm. In addition, we apply our method to a real-world data set to illustrate the state-dependent and time-varying properties of heavy-tail phenomena in the financial industry.
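
To make the boosted tail-index idea concrete, the sketch below shows one way such an algorithm could look; it is not the authors' EVIboost implementation. It assumes a conditional Pareto model for exceedance ratios z = y/u > 1 above a high threshold u, with P(Z > z | X = x) ≈ z^(-1/γ(x)), parameterizes the extreme value index as γ(x) = exp(F(x)), and boosts shallow regression trees on the negative gradient of the Pareto deviance. The function names, default tuning parameters, and the use of scikit-learn's DecisionTreeRegressor as base learner are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' EVIboost code): gradient boosting of a
# covariate-dependent extreme value index gamma(x) under a conditional Pareto
# model for exceedance ratios z = y / u > 1, with gamma(x) = exp(F(x)).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_evi_boost(X, z, n_trees=200, learning_rate=0.05, max_depth=2):
    """Boost shallow regression trees on the negative gradient of the Pareto deviance.

    Per-observation loss with gamma = exp(F):
        L(F; z) = F + (exp(-F) + 1) * log(z),
    so the pseudo-residual (negative gradient) is r = exp(-F) * log(z) - 1.
    """
    log_z = np.log(np.asarray(z, dtype=float))   # requires z > 1 (exceedance ratios)
    f0 = np.log(log_z.mean())                    # constant start: log of the pooled Hill/MLE estimate
    F = np.full(log_z.shape, f0)
    trees = []
    for _ in range(n_trees):
        residual = np.exp(-F) * log_z - 1.0      # pseudo-residuals of the Pareto deviance
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)
        F += learning_rate * tree.predict(X)     # shrunken additive update of F(x)
        trees.append(tree)
    return f0, trees

def predict_evi(f0, trees, X_new, learning_rate=0.05):
    """Return the fitted extreme value index gamma(x) = exp(F(x)) at new covariates."""
    F = np.full(len(X_new), f0)
    for tree in trees:                           # learning_rate must match the one used in fitting
        F += learning_rate * tree.predict(X_new)
    return np.exp(F)
```

Under this parameterization the constant initialization is the log of the pooled Hill/maximum-likelihood estimate mean(log z), while the learning rate and tree depth act as the usual boosting regularizers; with sparse tail data, the number of trees would typically be chosen by early stopping on held-out exceedances.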

Supplementary material

The following files are included in the supplementary material: (1) programs for modeling TIR and EVIboost; (2) code files for the simulation study, along with detailed experiment results; (3) code and data files for the financial data analysis via EVIboost; (4) simulation results on the computational time of the EVIboost model.

Copyright
2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
gradient boosting; heterogeneous extremes; Pareto model; tail estimation; tree-based method

Funding
Yanxi Hou’s research was partly supported by the National Natural Science Foundation of China Grant 72171055 and the Natural Science Foundation of Shanghai Grant 20ZR1403900.

Metrics (since February 2021)

  • Article views: 702
  • PDF downloads: 292

Journal of Data Science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
