Q-learning with Compound Outcome and Mixed Misclassification and Measurement Error in Covariates

Khadem Charvadeh, Yasin; Yi, Grace Y.

doi:10.6339/25-JDS1200

Journal of Data Science

Pub. online: 15 October 2025

Type: Statistical Data Science

Open Access

Received: 30 August 2024

Accepted: 21 September 2025

Published: 15 October 2025

Abstract

Precision medicine aims to tailor medical treatments and interventions to patients based on their individual characteristics. Several estimation techniques, including Q-learning, have been developed to determine optimal treatment rules. However, the applicability of these methods depends on the availability of precisely measured variables. This study extends Q-learning to compound outcomes, departing from the commonly assumed univariate outcomes, and further accommodates data with mismeasurement in both binary and continuous covariates. Two methods are described to mitigate the impact of mismeasurement. Numerical studies reveal that mismeasurement in covariates leads to notable estimation bias in the parameters indexing the optimal treatment, whereas the methods that account for mismeasurement effects yield improved results.

Supplementary material

S1. An Example of Constructing $S^{*}_{Kj}(\theta_{Kj}; Y_{Kji}, \overline{A}_{Ki}, \overline{X}^{*}_{Ki}, \overline{C}^{*}_{Ki}, \overline{Z}_{Ki})$

S2. Proportion of Optimally Treated Future Patients

S3. Simulation Results for Correction Strategies with Reduced Sample Size

S4. Simulation Results for Correction Strategies with Reduced Validation Subsample Size

S5. Data Analysis

© 2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

compound outcome; dynamic treatment regimes; estimating function; misclassification; measurement error; Q-learning; regression calibration; regression models

Funding

Yi is the Canada Research Chair in Data Science (Tier 1). Her research is supported by funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Canada Research Chairs Program.
