Journal of Data Science


The Typicality Principle and Its Implications for Statistics and Data Science
Volume 24, Issue 1 (2026): Special Issue: Statistical aspects of Trustworthy Machine Learning, pp. 4–25
Yiran Jiang, Zeyu Zhang, Ryan Martin, et al. (4 authors)

https://doi.org/10.6339/26-JDS1217
Pub. online: 26 January 2026 · Type: Philosophies of Data Science · Open Access

Received: 4 January 2026
Accepted: 15 January 2026
Published: 26 January 2026

Abstract

A central focus of data science is the transformation of empirical evidence into knowledge. By “knowledge,” we mean claims that are (i) supported by data through an explicit inferential procedure and (ii) accompanied by calibrated measures of uncertainty. As such, the scientific insights and attitudes of deep thinkers like Ronald A. Fisher, Karl R. Popper, and John W. Tukey are expected to inspire exciting new advances in machine learning and artificial intelligence in years to come. Along these lines, the present paper advances a novel typicality principle which states, roughly, that if the observed data is sufficiently “atypical” in a certain sense relative to a posited theory, then that theory is unwarranted. This emphasis on typicality brings familiar but often overlooked background notions like model-checking to the inferential foreground. One instantiation of the typicality principle is in the context of parameter estimation, where we propose a new typicality-based regularization strategy that leans heavily on goodness-of-fit testing. The effectiveness of this new regularization strategy is illustrated in three non-trivial examples where ordinary maximum likelihood estimation fails miserably. We also demonstrate how the typicality principle fits within a bigger picture of reliable and efficient uncertainty quantification.
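The abstract's core idea, that a posited theory is unwarranted when the observed data are sufficiently "atypical" under it, can be sketched as a goodness-of-fit screen. The following illustration is not the paper's actual procedure: the function names, the Kolmogorov–Smirnov statistic, the Gaussian working model, and the large-sample critical value are all assumptions chosen for a minimal, self-contained example.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    # Normal CDF via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(data, mu, sigma):
    # One-sample Kolmogorov-Smirnov statistic of the data against N(mu, sigma^2):
    # the largest gap between the empirical CDF and the posited model's CDF.
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def is_typical(data, mu, sigma):
    # Crude typicality check: declare the data "typical" of the posited model
    # if the KS statistic is below the approximate large-sample 5% critical
    # value 1.358 / sqrt(n). Atypical data would count against the model.
    n = len(data)
    return ks_statistic(data, mu, sigma) <= 1.358 / math.sqrt(n)

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
print(is_typical(sample, 0.0, 1.0))  # usually True: data drawn from the posited model
print(is_typical(sample, 3.0, 1.0))  # False: the posited mean is badly mis-specified
```

In the paper's estimation setting, a check of this kind would act as a regularizer: candidate parameter values under which the observed data look atypical are screened out, rather than trusted simply because they maximize the likelihood.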

Supplementary material

Code to reproduce all figures in this paper is included in the supplementary materials.



Copyright
2026 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
falsification, goodness-of-fit, inferential model, likelihood, model-checking, regularization, uncertainty quantification

Funding
Liu and Zhang are supported by the U.S. National Science Foundation grant DMS-2412629. Martin is supported by the U.S. National Science Foundation grant DMS-2412628.


Journal of Data Science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
