Journal of Data Science

Estimating Healthcare Expenditure Using Parametric Change Point Models
Indranil Ghosh, Qi Zheng, Michael E Egger, et al. (4 authors)

https://doi.org/10.6339/24-JDS1157
Pub. online: 3 December 2024      Type: Data Science In Action      Open Access

Received: 1 July 2024
Accepted: 12 October 2024
Published: 3 December 2024

Abstract

Estimating healthcare expenditures is important for policymakers and clinicians. The expenditures of patients facing a life-threatening illness can often be segmented into four distinct phases: diagnosis, treatment, stable, and terminal. The diagnosis phase covers healthcare expenses incurred prior to the disease diagnosis, which are attributable to frequent healthcare visits and diagnostic tests. The treatment phase, which follows diagnosis, typically shows high expenditure due to various treatments; expenditure then gradually tapers off into a stable phase and eventually gives way to a terminal phase. In this project, we introduce a pre-disease phase preceding the diagnosis phase, serving as a baseline for healthcare expenditure, and thus propose a five-phase model to evaluate healthcare expenditures. We use a piecewise linear model with three population-level change points and $4p$ subject-level parameters, where $p$ is the number of covariates, to capture expenditure trajectories and identify transitions between phases. To estimate the model’s coefficients, we apply generalized estimating equations, while a grid-search approach is used to estimate the change-point parameters by minimizing the residual sum of squares. In our analysis of expenditures for stage I–III pancreatic cancer patients using the SEER-Medicare database, we find that the diagnosis phase begins one month before diagnosis, followed by an initial treatment phase lasting three months. The stable phase continues until eight months before death, at which point the terminal phase begins, marked by a renewed increase in expenditures.
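The grid-search step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (their R code is in the supplementary material) and it rests on several assumptions: ordinary least squares stands in for generalized estimating equations, all subjects share a single time axis in months relative to diagnosis, the mean is a continuous piecewise linear function built from hinge terms, and there are no covariates. The function names, the simulated data, and the values in `TRUE_CPS` and `TRUE_BETA` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

# Hypothetical "true" values used only to simulate data for this sketch.
TRUE_CPS = (-1, 3, 16)                             # change points in months
TRUE_BETA = np.array([1.0, 0.0, 3.0, -3.5, 1.5])   # intercept, slope, 3 slope changes


def hinge_design(t, change_points):
    """Design matrix of a continuous piecewise linear mean:
    intercept, time, and one hinge term max(t - c, 0) per change point."""
    cols = [np.ones_like(t), t]
    cols += [np.maximum(t - c, 0.0) for c in change_points]
    return np.column_stack(cols)


def fit_rss(t, y, change_points):
    """Fit coefficients by least squares (a stand-in for GEE) for fixed
    change points; return the residual sum of squares and coefficients."""
    X = hinge_design(t, change_points)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), beta


def grid_search(t, y, grid, n_cp=3):
    """Try every increasing n_cp-tuple of candidate change points from
    `grid` and keep the one with the smallest residual sum of squares."""
    best_rss, best_cps, best_beta = np.inf, None, None
    for cps in itertools.combinations(grid, n_cp):
        rss, beta = fit_rss(t, y, cps)
        if rss < best_rss:
            best_rss, best_cps, best_beta = rss, cps, beta
    return best_rss, best_cps, best_beta


def simulate(n_subjects=20, noise_sd=0.5, seed=0):
    """Simulate monthly expenditures on months -12..24 for each subject."""
    rng = np.random.default_rng(seed)
    months = np.arange(-12, 25, dtype=float)
    t = np.tile(months, n_subjects)
    mean = hinge_design(t, TRUE_CPS) @ TRUE_BETA
    y = mean + noise_sd * rng.standard_normal(t.shape)
    return t, y


if __name__ == "__main__":
    t, y = simulate()
    grid = list(range(-6, 21))  # candidate change points (whole months)
    rss, cps, beta = grid_search(t, y, grid)
    print("estimated change points:", cps)
    print("coefficients:", np.round(beta, 2))
    print("RSS:", round(rss, 2))
```

The search is exhaustive, so its cost grows with the number of candidate triples; a whole-month grid keeps this small. Replacing `fit_rss` with a GEE fit that accounts for within-subject correlation would bring the sketch closer to the approach the abstract describes.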

Supplementary material

R Codes for Key Steps of the Case Study



Copyright
2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
changepoint models; healthcare expenditures; pancreatic cancer; phase-based expenditure; SEER-Medicare

Funding
M.E. Egger and M. Kong thank the American Cancer Society for its generous support of this study (CSDG-22-125-01-HOPS). M. Kong also acknowledges support from the Wendell Cherry Chair in Clinical Trial Research endowment funds at the University of Louisville, along with funding from the National Institutes of Health (P30ES030283, R01HL158779, and P20GM155899). Q. Zheng appreciates the support from the National Institutes of Health (R21AG070659) and the National Science Foundation (DMS-1952486).

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X

About

  • About journal

For contributors

  • Submit
  • OA Policy
  • Become a Peer-reviewer

Contact us

  • JDS@ruc.edu.cn
  • No. 59 Zhongguancun Street, Haidian District Beijing, 100872, P.R. China
Powered by PubliMill  •  Privacy policy