Causal Inference: A Tale of Three Frameworks

Wang, Linbo; Richardson, Thomas S.; Robins, James M.

doi:10.6339/25-JDS1211

Journal of Data Science

Volume 24, Issue 1 (2026): Special Issue: Statistical aspects of Trustworthy Machine Learning, pp. 53–85

Pub. online: 11 February 2026

Type: Data Science Reviews

Open Access

Received: 8 September 2025

Accepted: 8 December 2025

Published: 11 February 2026

Abstract

Causal inference is a central goal across many scientific disciplines. Over the past several decades, three major frameworks have emerged to formalize causal questions and guide their analysis: the potential outcomes framework, structural equation models, and directed acyclic graphs. Although these frameworks differ in language, assumptions, and philosophical orientation, they often lead to compatible or complementary insights. This paper provides a comparative introduction to the three frameworks, clarifying their connections, highlighting their distinct strengths and limitations, and illustrating how they can be used together in practice. The discussion is aimed at researchers and graduate students with some background in statistics or causal inference who are seeking a conceptual foundation for applying causal methods across a range of substantive domains.

© 2026 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

directed acyclic graphs, identification, potential outcomes, structural equation models, SWIGs
