
Journal of Data Science


A Joint Equivalence and Difference (JED) Test for Practical Use in Controlled Trials
Volume 23, Issue 1 (2025), pp. 171–187
Robert H. Riffenburgh, Lingge Wang

https://doi.org/10.6339/24-JDS1142
Pub. online: 2 July 2024      Type: Statistical Data Science      Open Access

Received
5 February 2024
Accepted
8 June 2024
Published
2 July 2024

Abstract

A joint equivalence and difference (JED) test is needed because difference tests and equivalence (more exactly, similarity) tests each provide only a one-sided answer. The concept and its underlying theory have appeared numerous times in the literature, as noted and discussed here, but never in a form usable in everyday statistical applications. This work provides such a form: a straightforward test with a step-by-step guide, formulas, and possible interpretations. As an initial treatment, it restricts attention to a t test of two means. The guide is illustrated by a numerical example from orthopedics. To assess the quality of the JED test, the sensitivity and specificity of its outcomes are examined as functions of the error risk α, total sample size, sub-sample size ratio, and variability ratio. The results are presented in tables, and their interpretations are discussed. It is concluded that the test exhibits high power and effect size, and that commonly seen values of these parameters affect the power or effect size of the JED test only when samples are quite small. Data for the example and computer code for using the JED test are accessible through links to the supplementary material. We recommend that this work be extended to other test forms and to multivariate forms.
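
As a rough illustration of the general idea of pairing a difference test with a similarity test on two means, the sketch below combines a Welch t statistic with a two one-sided tests (TOST) check. It is not the authors' JED procedure, which is specified in the article and in the supplementary R code. The function names, the similarity margin `margin`, and the caller-supplied critical values `t_two_sided` and `t_one_sided` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, variance


@dataclass
class WelchSummary:
    diff: float  # difference of sample means, mean(x) - mean(y)
    se: float    # standard error of the difference (unequal variances)
    df: float    # Welch-Satterthwaite approximate degrees of freedom


def welch_summary(x, y):
    """Summarize two independent samples for a Welch-type t test."""
    n1, n2 = len(x), len(y)
    v1 = variance(x) / n1  # squared standard error of mean(x)
    v2 = variance(y) / n2  # squared standard error of mean(y)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return WelchSummary(mean(x) - mean(y), sqrt(v1 + v2), df)


def joint_decision(s, margin, t_two_sided, t_one_sided):
    """Return (different, similar) for a summary `s`.

    `margin` is a hypothetical similarity margin chosen by the analyst.
    `t_two_sided` and `t_one_sided` are t critical values that the caller
    looks up for the chosen alpha and for s.df (assumed inputs, not computed).
    """
    # Difference test: reject "no difference" if |t| exceeds the critical value.
    different = abs(s.diff) / s.se > t_two_sided
    # TOST: both one-sided tests must reject for the means to be called similar.
    t_lower = (s.diff + margin) / s.se  # tests H0: true difference <= -margin
    t_upper = (s.diff - margin) / s.se  # tests H0: true difference >= +margin
    similar = t_lower > t_one_sided and t_upper < -t_one_sided
    return different, similar
```

A caller would compute `welch_summary(x, y)`, look up the critical values for the resulting `df` in a t table, and read the pair `(different, similar)`. When both entries are false, neither question is answered by the data.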

Supplementary material

The dataset used in the numerical example (Section 3) and the R code for the tables (Section 4) can be found at: https://github.com/wlingge/JED



Copyright
2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
decision-making error rate estimation means testing medical decisions statistical testing


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
