Physician Effects in Critical Care: A Causal Inference Approach Through Propensity Weighting with Parametric and Super Learning Methods
Pub. online: 2 July 2024
Type: Data Science In Action
Open Access
Received: 13 April 2024
Accepted: 8 June 2024
Published: 2 July 2024
Abstract
Physician performance is critical to the care of patients admitted to the intensive care unit (ICU), who are in life-threatening situations and require high-level medical care and interventions. Evaluating physicians is crucial for ensuring a high standard of medical care and fostering continuous performance improvement. The non-randomized nature of ICU data often results in imbalance in patient covariates across physician groups, making direct comparisons of patients’ survival probabilities across physicians misleading. In this article, we use the propensity weighting method to address confounding, achieve covariate balance, and assess physician effects. Because the parametric models may be misspecified, we compare the performance of propensity weighting methods based on parametric models with those based on super learning. When the generalized propensity or the quality function is not correctly specified within the parametric propensity weighting framework, super learning-based propensity weighting methods yield more efficient estimators. We demonstrate that propensity weighting offers an effective way to assess physician performance, a topic of considerable interest to hospital administrators.
Supplementary material
The R code for this paper can be found at the Journal of Data Science website.