Journal of Data Science


Mortgage Prepayment Modeling via a Smoothing Spline State Space Model
Haoran Lu, Huimin Cheng, Ye Wang, et al. (8 authors)

https://doi.org/10.6339/25-JDS1165
Pub. online: 30 January 2025      Type: Statistical Data Science      Open Access

Received: 25 August 2024
Accepted: 6 January 2025
Published: 30 January 2025

Abstract

Loan behavior modeling is crucial in financial engineering. In particular, predicting loan prepayment from large-scale historical time series data on a very large number of customers is challenging. Existing approaches, such as logistic regression or nonparametric regression, model only the direct relationship between the features and prepayment. Motivated by the goal of extracting the hidden states of loan behavior, we propose the smoothing spline state space (QuadS) model, a hidden Markov model whose varying transition and emission matrices are modeled by smoothing splines. In contrast to existing methods, our method captures the loans’ unobserved state transitions, which both improves prediction performance and provides greater interpretability. The model is fitted with the EM algorithm; within each iteration, the smoothing splines are fitted by penalized least squares. Simulation studies demonstrate the effectiveness of the proposed method. Furthermore, a real-world case study using loan data from the Federal National Mortgage Association illustrates the practical applicability of our model. The QuadS model not only provides reliable predictions but also uncovers meaningful hidden behavior patterns that can offer valuable insights for the financial industry.
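
To make the hidden-state idea concrete, the following is a minimal sketch of the E-step of an EM iteration for a hidden Markov model whose transition and emission matrices change from one time step to the next. It is not the authors' QuadS implementation: here the matrices are simply supplied as arrays, whereas in QuadS they would come from fitted smoothing splines. The function name `forward_backward`, the array layout, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass for an HMM with per-step matrices.

    pi  : (K,)        initial state distribution
    A   : (T-1, K, K) A[t, i, j] = P(state_{t+1} = j | state_t = i)
    B   : (T, K, M)   B[t, k, m] = P(obs_t = m | state_t = k)
    obs : (T,)        integer observations in {0, ..., M-1}
    Returns the posterior state probabilities (T, K) and the log-likelihood.
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)

    # Forward pass, rescaled at every step to avoid numerical underflow.
    alpha[0] = pi * B[0, :, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A[t - 1]) * B[t, :, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, reusing the forward scaling constants.
    for t in range(T - 2, -1, -1):
        beta[t] = A[t] @ (B[t + 1, :, obs[t + 1]] * beta[t + 1]) / scale[t + 1]

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, np.log(scale).sum()

# Toy example (hypothetical numbers): two hidden states, binary observations.
T, K, M = 4, 2, 2
stay = np.linspace(0.9, 0.6, T - 1)  # probability of staying in a state drifts over time
A = np.stack([np.array([[s, 1 - s], [1 - s, s]]) for s in stay])
B = np.tile(np.array([[0.8, 0.2], [0.3, 0.7]]), (T, 1, 1))
pi = np.array([0.5, 0.5])
obs = np.array([0, 0, 1, 1])

gamma, loglik = forward_backward(pi, A, B, obs)
print(gamma)
print(loglik)
```

Each row of `gamma` is the posterior distribution over hidden states at one time step given the whole observation sequence. In an EM scheme of the kind the abstract describes, quantities like these would serve as the targets or weights when the spline-based transition and emission models are refitted in the M-step.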

Supplementary material

Details of the EM algorithm for QuadS are provided in Appendix A. Code and instructions for the QuadS method are available on GitHub (https://github.com/haoranlustat/QuadS). The dataset used in the case study is publicly available from Fannie Mae Data Dynamics (https://capitalmarkets.fanniemae.com/tools-applications/data-dynamics).


Copyright
2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
hidden Markov model; mortgage prepayment; nonparametric model; smoothing spline ANOVA

Funding
This research was partially supported by the U.S. National Science Foundation under grants NSF DMS-1925066, DMS-1903226, DMS-2124493, DMS-2311297, DMS-2319279, and DMS-2318809, and by the U.S. National Institutes of Health under grant R01GM152814.

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
