<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1213</article-id>
<article-id pub-id-type="doi">10.6339/25-JDS1213</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Statistical Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>A Bayesian Approach to Pre-Post Comparison of Inter-Rater Agreement in Ordinal Ratings</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Berry</surname><given-names>Aiden</given-names></name><xref ref-type="aff" rid="j_jds1213_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Cao</surname><given-names>Jennifer</given-names></name><xref ref-type="aff" rid="j_jds1213_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname><given-names>Song</given-names></name><xref ref-type="aff" rid="j_jds1213_aff_003">3</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1213_aff_001"><label>1</label>Department of Statistics and Data Science, <institution>Southern Methodist University</institution>, Dallas, TX, <country>USA</country></aff>
<aff id="j_jds1213_aff_002"><label>2</label>Department of Ophthalmology, <institution>University of Texas Southwestern Medical Center</institution>, Dallas, TX, <country>USA</country></aff>
<aff id="j_jds1213_aff_003"><label>3</label>Department of Health Data Science and Biostatistics, <institution>University of Texas Southwestern Medical Center</institution>, Dallas, TX, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:song.zhang@utsouthwestern.edu">song.zhang@utsouthwestern.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2025</year></pub-date><pub-date pub-type="epub"><day>16</day><month>12</month><year>2025</year></pub-date><volume content-type="ahead-of-print">0</volume><issue>0</issue><fpage>1</fpage><lpage>14</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1213_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The supplementary material includes supplementary tables and R code.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>2</day><month>9</month><year>2025</year></date><date date-type="accepted"><day>8</day><month>12</month><year>2025</year></date></history>
<permissions><copyright-statement>2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2025</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Inter-rater agreement is fundamental to decision making in medicine, psychology, and the social sciences, as it reflects the quality and reliability of rating systems. The intraclass correlation coefficient (ICC) has been widely used as a measure of inter-rater agreement. To date, no method has been developed to properly assess improvement in ICC for pre–post studies with ordinal ratings, and it remains uninvestigated whether and how correlations between pre- and post-intervention scores affect the estimation and comparison of ICCs. We present a Bayesian hierarchical probit framework for evaluating changes in ICCs in such settings. The model incorporates rater- and item-level correlations and compares two parameterizations: an “individual components” prior that separately models variances and correlations, and an inverse Wishart prior. Simulation studies show that accounting for pre–post correlation substantially improves estimation accuracy and power to detect changes in agreement, while ignoring it reduces efficiency. Application to a multicenter study on conjunctival inflammation demonstrates that a novel grading scale markedly increased inter-rater agreement. This framework underscores the importance of modeling ordinal outcomes appropriately and provides a flexible Bayesian tool for evaluating the effectiveness of interventions on inter-rater agreement in pre–post studies.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>Bayesian</kwd>
<kwd>inter-rater agreement</kwd>
<kwd>intraclass correlation</kwd>
<kwd>ordinal</kwd>
<kwd>pre-post design</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1213_reflist_001">
<title>References</title>
<ref id="j_jds1213_ref_001">
<mixed-citation publication-type="book"> <string-name><surname>Ahn</surname> <given-names>C</given-names></string-name>, <string-name><surname>Heo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>S</given-names></string-name> (<year>2014</year>). <source><italic>Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical Research</italic></source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Albert</surname> <given-names>JH</given-names></string-name>, <string-name><surname>Chib</surname> <given-names>S</given-names></string-name> (<year>1993</year>). <article-title>Bayesian analysis of binary and polychotomous response data</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>88</volume>(<issue>422</issue>): <fpage>669</fpage>–<lpage>679</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.1993.10476321" xlink:type="simple">https://doi.org/10.1080/01621459.1993.10476321</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Atenafu</surname> <given-names>EG</given-names></string-name>, <string-name><surname>Hamid</surname> <given-names>JS</given-names></string-name>, <string-name><surname>To</surname> <given-names>T</given-names></string-name>, <string-name><surname>Willan</surname> <given-names>AR</given-names></string-name>, <string-name><surname>Feldman</surname> <given-names>BM</given-names></string-name>, <string-name><surname>Beyene</surname> <given-names>J</given-names></string-name> (<year>2012</year>). <article-title>Bias-corrected estimator for intraclass correlation coefficient in the balanced one-way random effects model</article-title>. <source><italic>BMC Medical Research Methodology</italic></source>, <volume>12</volume>(<issue>126</issue>): <fpage>1</fpage>–<lpage>8</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1186/1471-2288-12-126" xlink:type="simple">https://doi.org/10.1186/1471-2288-12-126</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Calle-Alonso</surname> <given-names>F</given-names></string-name>, <string-name><surname>Perez Sanchez</surname> <given-names>CJ</given-names></string-name> (<year>2015</year>). <article-title>A Monte Carlo–based Bayesian approach for measuring agreement in a qualitative scale</article-title>. <source><italic>Applied Psychological Measurement</italic></source>, <volume>39</volume>(<issue>3</issue>): <fpage>189</fpage>–<lpage>207</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/0146621614554080" xlink:type="simple">https://doi.org/10.1177/0146621614554080</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Cohen</surname> <given-names>J</given-names></string-name> (<year>1960</year>). <article-title>A coefficient of agreement for nominal scales</article-title>. <source><italic>Educational and Psychological Measurement</italic></source>, <volume>20</volume>(<issue>1</issue>): <fpage>37</fpage>–<lpage>46</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/001316446002000104" xlink:type="simple">https://doi.org/10.1177/001316446002000104</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_006">
<mixed-citation publication-type="journal"> <string-name><surname>Cohen</surname> <given-names>J</given-names></string-name> (<year>1968</year>). <article-title>Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit</article-title>. <source><italic>Psychological Bulletin</italic></source>, <volume>70</volume>(<issue>4</issue>): <fpage>213</fpage>–<lpage>220</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1037/h0026256" xlink:type="simple">https://doi.org/10.1037/h0026256</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_007">
<mixed-citation publication-type="journal"> <string-name><surname>Eziama</surname> <given-names>E</given-names></string-name>, <string-name><surname>Nguyen</surname> <given-names>C</given-names></string-name>, <string-name><surname>Foster</surname> <given-names>CS</given-names></string-name>, <string-name><surname>Heydinger</surname> <given-names>S</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>JH</given-names></string-name> (<year>2025</year>). <article-title>Novel grading scale for conjunctival inflammation in cicatrizing conjunctivitis associated with pemphigoid</article-title>. <source><italic>Ocular Immunology and Inflammation</italic></source>, <volume>33</volume>(<issue>4</issue>): <fpage>649</fpage>–<lpage>653</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/09273948.2024.2434128" xlink:type="simple">https://doi.org/10.1080/09273948.2024.2434128</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Fanshawe</surname> <given-names>TR</given-names></string-name>, <string-name><surname>Lynch</surname> <given-names>AG</given-names></string-name>, <string-name><surname>Ellis</surname> <given-names>IO</given-names></string-name>, <string-name><surname>Green</surname> <given-names>AR</given-names></string-name>, <string-name><surname>Hanka</surname> <given-names>R</given-names></string-name> (<year>2008</year>). <article-title>Assessing agreement between multiple raters with missing rating information, applied to breast cancer tumour grading</article-title>. <source><italic>PLoS ONE</italic></source>, <volume>3</volume>(<issue>8</issue>): <fpage>e2925</fpage>–<lpage>e2936</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1371/journal.pone.0002925" xlink:type="simple">https://doi.org/10.1371/journal.pone.0002925</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>Fisher</surname> <given-names>RA</given-names></string-name> (<year>1921</year>). <article-title>On the “probable error” of a coefficient of correlation deduced from a small sample</article-title>. <source><italic>Metron</italic></source>, <volume>1</volume>: <fpage>3</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_010">
<mixed-citation publication-type="book"> <string-name><surname>Fleiss</surname> <given-names>JL</given-names></string-name>, <string-name><surname>Levin</surname> <given-names>B</given-names></string-name>, <string-name><surname>Paik</surname> <given-names>MC</given-names></string-name> (<year>2013</year>). <source><italic>Statistical Methods for Rates and Proportions</italic></source>. <publisher-name>John Wiley &amp; Sons</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Gajewski</surname> <given-names>BJ</given-names></string-name>, <string-name><surname>Hart</surname> <given-names>S</given-names></string-name>, <string-name><surname>Bergquist-Beringer</surname> <given-names>S</given-names></string-name>, <string-name><surname>Dunton</surname> <given-names>N</given-names></string-name> (<year>2007</year>). <article-title>Inter-rater reliability of pressure ulcer staging: Ordinal probit Bayesian hierarchical model that allows for uncertain rater response</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>26</volume>(<issue>25</issue>): <fpage>4602</fpage>–<lpage>4618</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.2877" xlink:type="simple">https://doi.org/10.1002/sim.2877</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_012">
<mixed-citation publication-type="book"> <string-name><surname>Gelman</surname> <given-names>A</given-names></string-name>, <string-name><surname>Carlin</surname> <given-names>JB</given-names></string-name>, <string-name><surname>Stern</surname> <given-names>HS</given-names></string-name>, <string-name><surname>Rubin</surname> <given-names>DB</given-names></string-name> (<year>1995</year>). <source><italic>Bayesian Data Analysis</italic></source>. <publisher-name>Chapman and Hall/CRC</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Giraudeau</surname> <given-names>B</given-names></string-name>, <string-name><surname>Mary</surname> <given-names>J</given-names></string-name> (<year>2001</year>). <article-title>Planning a reproducibility study: How many subjects and how many replicates per subject for an expected width of the 95 per cent confidence interval of the intraclass correlation coefficient</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>20</volume>(<issue>21</issue>): <fpage>3205</fpage>–<lpage>3214</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.935" xlink:type="simple">https://doi.org/10.1002/sim.935</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Hallgren</surname> <given-names>KA</given-names></string-name> (<year>2012</year>). <article-title>Computing inter-rater reliability for observational data: An overview and tutorial</article-title>. <source><italic>Tutorials in Quantitative Methods for Psychology</italic></source>, <volume>8</volume>(<issue>1</issue>): <fpage>23</fpage>–<lpage>34</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.20982/tqmp.08.1.p023" xlink:type="simple">https://doi.org/10.20982/tqmp.08.1.p023</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Konishi</surname> <given-names>S</given-names></string-name> (<year>1985</year>). <article-title>Normalizing and variance stabilizing transformations for intraclass correlations</article-title>. <source><italic>Annals of the Institute of Statistical Mathematics</italic></source>, <volume>37</volume>(<issue>1</issue>): <fpage>87</fpage>–<lpage>94</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/BF02481082" xlink:type="simple">https://doi.org/10.1007/BF02481082</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Müller</surname> <given-names>R</given-names></string-name>, <string-name><surname>Büttner</surname> <given-names>P</given-names></string-name> (<year>1994</year>). <article-title>A critical discussion of intraclass correlation coefficients</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>13</volume>(<issue>23–24</issue>): <fpage>2465</fpage>–<lpage>2476</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.4780132310" xlink:type="simple">https://doi.org/10.1002/sim.4780132310</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Nelson</surname> <given-names>KP</given-names></string-name>, <string-name><surname>Edwards</surname> <given-names>D</given-names></string-name> (<year>2015</year>). <article-title>Measures of agreement between many raters for ordinal classifications</article-title>. <source><italic>Statistics in Medicine</italic></source>, <volume>34</volume>(<issue>23</issue>): <fpage>3116</fpage>–<lpage>3132</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.6546" xlink:type="simple">https://doi.org/10.1002/sim.6546</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Olkin</surname> <given-names>I</given-names></string-name>, <string-name><surname>Lou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Stokes</surname> <given-names>L</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>J</given-names></string-name> (<year>2015</year>). <article-title>Analyses of wine-tasting data: A tutorial</article-title>. <source><italic>Journal of Wine Economics</italic></source>, <volume>10</volume>(<issue>1</issue>): <fpage>4</fpage>–<lpage>30</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1017/jwe.2014.26" xlink:type="simple">https://doi.org/10.1017/jwe.2014.26</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_019">
<mixed-citation publication-type="journal"> <string-name><surname>Shrout</surname> <given-names>PE</given-names></string-name>, <string-name><surname>Fleiss</surname> <given-names>JL</given-names></string-name> (<year>1979</year>). <article-title>Intraclass correlations: Uses in assessing rater reliability</article-title>. <source><italic>Psychological Bulletin</italic></source>, <volume>86</volume>(<issue>2</issue>): <fpage>420</fpage>–<lpage>428</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1037/0033-2909.86.2.420" xlink:type="simple">https://doi.org/10.1037/0033-2909.86.2.420</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Tran</surname> <given-names>QD</given-names></string-name>, <string-name><surname>Demirhan</surname> <given-names>H</given-names></string-name>, <string-name><surname>Dolgun</surname> <given-names>A</given-names></string-name> (<year>2021</year>). <article-title>Bayesian approaches to the weighted kappa-like inter-rater agreement measures</article-title>. <source><italic>Statistical Methods in Medical Research</italic></source>, <volume>30</volume>(<issue>10</issue>): <fpage>2329</fpage>–<lpage>2351</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/09622802211037068" xlink:type="simple">https://doi.org/10.1177/09622802211037068</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Van Oest</surname> <given-names>R</given-names></string-name>, <string-name><surname>Girard</surname> <given-names>JM</given-names></string-name> (<year>2022</year>). <article-title>Weighting schemes and incomplete data: A generalized Bayesian framework for chance-corrected interrater agreement</article-title>. <source><italic>Psychological Methods</italic></source>, <volume>27</volume>(<issue>6</issue>): <fpage>1069</fpage>–<lpage>1088</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Von Rosen</surname> <given-names>D</given-names></string-name> (<year>1988</year>). <article-title>Moments for the inverted Wishart distribution</article-title>. <source><italic>Scandinavian Journal of Statistics</italic></source>, <volume>15</volume>(<issue>2</issue>): <fpage>97</fpage>–<lpage>109</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1213_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Wang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Yandell</surname> <given-names>B</given-names></string-name>, <string-name><surname>Rutledge</surname> <given-names>J</given-names></string-name> (<year>1991</year>). <article-title>Bias of maximum likelihood estimator of intraclass correlation</article-title>. <source><italic>Theoretical and Applied Genetics</italic></source>, <volume>82</volume>(<issue>4</issue>): <fpage>421</fpage>–<lpage>424</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/BF00588594" xlink:type="simple">https://doi.org/10.1007/BF00588594</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Yue</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sair</surname> <given-names>HI</given-names></string-name>, <string-name><surname>Airan</surname> <given-names>R</given-names></string-name>, <string-name><surname>Caffo</surname> <given-names>BS</given-names></string-name> (<year>2015</year>). <article-title>Estimating a graphical intra-class correlation coefficient (GICC) using multivariate probit-linear mixed models</article-title>. <source><italic>Computational Statistics &amp; Data Analysis</italic></source>, <volume>89</volume>: <fpage>126</fpage>–<lpage>133</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.csda.2015.02.012" xlink:type="simple">https://doi.org/10.1016/j.csda.2015.02.012</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ahn</surname> <given-names>C</given-names></string-name> (<year>2018</year>). <article-title>Sample size calculation for before–after experiments with partially overlapping cohorts</article-title>. <source><italic>Contemporary Clinical Trials</italic></source>, <volume>64</volume>: <fpage>274</fpage>–<lpage>280</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.cct.2015.09.015" xlink:type="simple">https://doi.org/10.1016/j.cct.2015.09.015</ext-link></mixed-citation>
</ref>
<ref id="j_jds1213_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name> (<year>2021</year>). <article-title>A note on Wishart and inverse Wishart priors for covariance matrix</article-title>. <source><italic>Journal of Behavioral Data Science</italic></source>, <volume>1</volume>(<issue>2</issue>): <fpage>119</fpage>–<lpage>126</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.35566/jbds/v1n2/p2" xlink:type="simple">https://doi.org/10.35566/jbds/v1n2/p2</ext-link></mixed-citation>
</ref>
</ref-list>
</back>
</article>
