Causal Inference: A Tale of Three Frameworks
Volume 24, Issue 1 (2026): Special Issue: Statistical aspects of Trustworthy Machine Learning, pp. 53–85
Pub. online: 11 February 2026
Type: Data Science Reviews
Open Access
Received
8 September 2025
Accepted
8 December 2025
Published
11 February 2026
Abstract
Causal inference is a central goal across many scientific disciplines. Over the past several decades, three major frameworks have emerged to formalize causal questions and guide their analysis: the potential outcomes framework, structural equation models, and directed acyclic graphs. Although these frameworks differ in language, assumptions, and philosophical orientation, they often lead to compatible or complementary insights. This paper provides a comparative introduction to the three frameworks, clarifying their connections, highlighting their distinct strengths and limitations, and illustrating how they can be used together in practice. The discussion is aimed at researchers and graduate students with some background in statistics or causal inference who are seeking a conceptual foundation for applying causal methods across a range of substantive domains.