<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1062</article-id>
<article-id pub-id-type="doi">10.6339/22-JDS1062</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Data Science in Action</subject></subj-group></article-categories>
<title-group>
<article-title>A Joint Analysis for Field Goal Attempts and Percentages of Professional Basketball Players: Bayesian Nonparametric Resource</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Wong-Toi</surname><given-names>Eliot</given-names></name><xref ref-type="aff" rid="j_jds1062_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Yang</surname><given-names>Hou-Cheng</given-names></name><xref ref-type="aff" rid="j_jds1062_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Shen</surname><given-names>Weining</given-names></name><xref ref-type="aff" rid="j_jds1062_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Hu</surname><given-names>Guanyu</given-names></name><email xlink:href="mailto:guanyu.hu@missouri.edu">guanyu.hu@missouri.edu</email><xref ref-type="aff" rid="j_jds1062_aff_003">3</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_jds1062_aff_001"><label>1</label>Department of Statistics, <institution>University of California Irvine</institution>, <country>USA</country></aff>
<aff id="j_jds1062_aff_002"><label>2</label>Department of Statistics, <institution>Florida State University</institution>, <country>USA</country></aff>
<aff id="j_jds1062_aff_003"><label>3</label>Department of Statistics, <institution>University of Missouri</institution>, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:guanyu.hu@missouri.edu">guanyu.hu@missouri.edu</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2023</year></pub-date><pub-date pub-type="epub"><day>9</day><month>8</month><year>2022</year></pub-date><volume>21</volume><issue>1</issue><fpage>68</fpage><lpage>86</lpage><supplementary-material id="S1" content-type="document" xlink:href="jds1062_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>The real data and R code used to reproduce the results in this paper can be found at <uri>https://github.com/ewongtoi/nba_shot_charts</uri>. Additional tables are presented in the Online Supplementary Material.</p>
</caption>
</supplementary-material><history><date date-type="received"><day>9</day><month>1</month><year>2022</year></date><date date-type="accepted"><day>17</day><month>7</month><year>2022</year></date></history>
<permissions><copyright-statement>2023 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Understanding shooting patterns among different players is a fundamental problem in basketball game analysis. In this paper, we quantify shooting patterns via field goal attempts and percentages over twelve non-overlapping regions of the front court. A joint Bayesian nonparametric mixture model is developed to find latent clusters of players based on their shooting patterns. We apply the proposed model to learn the heterogeneity among selected players in National Basketball Association (NBA) games over the 2018–2019 regular season and the 2019–2020 bubble season. Thirteen clusters are identified for the 2018–2019 regular season and seven for the 2019–2020 bubble season. We further examine the shooting patterns of players in these clusters and discuss their relation to other available player information. The results offer new insights into the effect of the NBA COVID bubble and may provide useful guidance for players’ shot selection and for teams’ in-game and recruiting strategy planning.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>Chinese restaurant process</kwd>
<kwd>mixture model</kwd>
<kwd>shot charts data</kwd>
<kwd>spatial spline</kwd>
<kwd>sports analytics</kwd>
</kwd-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1062_reflist_001">
<title>References</title>
<ref id="j_jds1062_ref_001">
<mixed-citation publication-type="chapter"> <string-name><surname>Aldous</surname> <given-names>DJ</given-names></string-name> (<year>1985</year>). <chapter-title>Exchangeability and related topics</chapter-title>. In: <source><italic>École d’Été de Probabilités de Saint-Flour XIII—1983</italic></source>, <fpage>1</fpage>–<lpage>198</lpage>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Antoniak</surname> <given-names>CE</given-names></string-name> (<year>1974</year>). <article-title>Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>2</volume>(<issue>6</issue>): <fpage>1152</fpage>–<lpage>1174</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_003">
<mixed-citation publication-type="journal"> <string-name><surname>Blackwell</surname> <given-names>D</given-names></string-name>, <string-name><surname>MacQueen</surname> <given-names>JB</given-names></string-name> (<year>1973</year>). <article-title>Ferguson distributions via Pólya urn schemes</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>1</volume>(<issue>2</issue>): <fpage>353</fpage>–<lpage>355</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_004">
<mixed-citation publication-type="chapter"> <string-name><surname>Bradley</surname> <given-names>JR</given-names></string-name>, <string-name><surname>Cressie</surname> <given-names>N</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>T</given-names></string-name> (<year>2011</year>). <chapter-title>Selection of rank and basis functions in the spatial random effects model</chapter-title>. In: <source><italic>Proceedings of the 2011 Joint Statistical Meetings</italic></source>, <fpage>3393</fpage>–<lpage>3406</lpage>. <publisher-name>American Statistical Association</publisher-name>, <publisher-loc>Alexandria, VA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Bradley</surname> <given-names>JR</given-names></string-name>, <string-name><surname>Holan</surname> <given-names>SH</given-names></string-name>, <string-name><surname>Wikle</surname> <given-names>CK</given-names></string-name> (<year>2015</year>). <article-title>Multivariate spatio-temporal models for high-dimensional areal data with application to longitudinal employer-household dynamics</article-title>. <source><italic>The Annals of Applied Statistics</italic></source>, <volume>9</volume>(<issue>4</issue>): <fpage>1761</fpage>–<lpage>1791</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_006">
<mixed-citation publication-type="chapter"> <string-name><surname>Dahl</surname> <given-names>DB</given-names></string-name>, <string-name><surname>Vannucci</surname> <given-names>M</given-names></string-name> (<year>2006</year>). <chapter-title>Model-based clustering for expression data via a Dirichlet process mixture model</chapter-title>. In: <source><italic>Bayesian Inference for Gene Expression and Proteomics</italic></source> (<string-name><given-names>KA</given-names> <surname>Do</surname></string-name>, <string-name><given-names>P</given-names> <surname>Müller</surname></string-name>, eds.), <fpage>201</fpage>–<lpage>218</lpage>. <publisher-name>Cambridge University Press</publisher-name>, <publisher-loc>Cambridge</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_007">
<mixed-citation publication-type="book"> <string-name><surname>de Valpine</surname> <given-names>P</given-names></string-name>, <string-name><surname>Paciorek</surname> <given-names>C</given-names></string-name>, <string-name><surname>Turek</surname> <given-names>D</given-names></string-name>, <string-name><surname>Michaud</surname> <given-names>N</given-names></string-name>, <string-name><surname>Anderson-Bergman</surname> <given-names>C</given-names></string-name>, <string-name><surname>Obermeyer</surname> <given-names>F</given-names></string-name>, <etal>et al.</etal> (<year>2021</year>a). <source><italic>NIMBLE: MCMC, Particle Filtering, and Programmable Hierarchical Modeling</italic></source>. <comment>R package version 0.12.1</comment>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_008">
<mixed-citation publication-type="book"> <string-name><surname>de Valpine</surname> <given-names>P</given-names></string-name>, <string-name><surname>Paciorek</surname> <given-names>C</given-names></string-name>, <string-name><surname>Turek</surname> <given-names>D</given-names></string-name>, <string-name><surname>Michaud</surname> <given-names>N</given-names></string-name>, <string-name><surname>Anderson-Bergman</surname> <given-names>C</given-names></string-name>, <string-name><surname>Obermeyer</surname> <given-names>F</given-names></string-name>, <etal>et al.</etal> (<year>2021</year>b). <source><italic>NIMBLE User Manual</italic></source>. <comment>R package manual version 0.12.1</comment>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_009">
<mixed-citation publication-type="journal"> <string-name><surname>de Valpine</surname> <given-names>P</given-names></string-name>, <string-name><surname>Turek</surname> <given-names>D</given-names></string-name>, <string-name><surname>Paciorek</surname> <given-names>C</given-names></string-name>, <string-name><surname>Anderson-Bergman</surname> <given-names>C</given-names></string-name>, <string-name><surname>Temple Lang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Bodik</surname> <given-names>R</given-names></string-name> (<year>2017</year>). <article-title>Programming with models: writing statistical algorithms for general model structures with NIMBLE</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>26</volume>: <fpage>403</fpage>–<lpage>413</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Ferguson</surname> <given-names>TS</given-names></string-name> (<year>1973</year>). <article-title>A Bayesian analysis of some nonparametric problems</article-title>. <source><italic>The Annals of Statistics</italic></source>, <volume>1</volume>(<issue>2</issue>): <fpage>209</fpage>–<lpage>230</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_011">
<mixed-citation publication-type="journal"> <string-name><surname>Franks</surname> <given-names>A</given-names></string-name>, <string-name><surname>Miller</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bornn</surname> <given-names>L</given-names></string-name>, <string-name><surname>Goldsberry</surname> <given-names>K</given-names></string-name> (<year>2015</year>). <article-title>Characterizing the spatial structure of defensive skill in professional basketball</article-title>. <source><italic>The Annals of Applied Statistics</italic></source>, <volume>9</volume>(<issue>1</issue>): <fpage>94</fpage>–<lpage>121</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>HC</given-names></string-name>, <string-name><surname>Xue</surname> <given-names>Y</given-names></string-name> (<year>2021</year>a). <article-title>Bayesian group learning for shot selection of professional basketball players</article-title>. <source><italic>Stat</italic></source>, <volume>10</volume>(<issue>1</issue>): <fpage>e324</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Hu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>HC</given-names></string-name>, <string-name><surname>Xue</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Dey</surname> <given-names>DK</given-names></string-name> (<year>2021</year>b). <article-title>Zero-inflated Poisson model with clustered regression coefficients: an application to heterogeneity learning of field goal attempts of professional basketball players</article-title>. <source><italic>Canadian Journal of Statistics</italic></source>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_014">
<mixed-citation publication-type="journal"> <string-name><surname>Ibrahim</surname> <given-names>JG</given-names></string-name>, <string-name><surname>Chu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>LM</given-names></string-name> (<year>2010</year>). <article-title>Basic concepts and methods for joint models of longitudinal and survival data</article-title>. <source><italic>Journal of Clinical Oncology</italic></source>, <volume>28</volume>(<issue>16</issue>): <fpage>2796</fpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_015">
<mixed-citation publication-type="journal"> <string-name><surname>Jiao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>J</given-names></string-name> (<year>2021</year>). <article-title>A Bayesian marked spatial point processes model for basketball shot chart</article-title>. <source><italic>Journal of Quantitative Analysis in Sports</italic></source>, <volume>17</volume>(<issue>2</issue>): <fpage>77</fpage>–<lpage>90</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Li</surname> <given-names>H</given-names></string-name>, <string-name><surname>Calder</surname> <given-names>CA</given-names></string-name>, <string-name><surname>Cressie</surname> <given-names>N</given-names></string-name> (<year>2007</year>). <article-title>Beyond Moran’s I: testing for spatial dependence based on the spatial autoregressive model</article-title>. <source><italic>Geographical Analysis</italic></source>, <volume>39</volume>(<issue>4</issue>): <fpage>357</fpage>–<lpage>375</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_017">
<mixed-citation publication-type="other"> <string-name><surname>Lim</surname> <given-names>J</given-names></string-name> (<year>2021</year>). Incorporating Bayesian variable selection into the spatial mixed effects models with process augmentation: Bayesian wavelet neural network (BWNN). Ph.D. thesis, The Florida State University.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_018">
<mixed-citation publication-type="journal"> <string-name><surname>Long</surname> <given-names>JD</given-names></string-name>, <string-name><surname>Mills</surname> <given-names>JA</given-names></string-name> (<year>2018</year>). <article-title>Joint modeling of multivariate longitudinal data and survival data in several observational studies of Huntington’s disease</article-title>. <source><italic>BMC Medical Research Methodology</italic></source>, <volume>18</volume>(<issue>1</issue>): <fpage>1</fpage>–<lpage>15</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_019">
<mixed-citation publication-type="chapter"> <string-name><surname>Miller</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bornn</surname> <given-names>L</given-names></string-name>, <string-name><surname>Adams</surname> <given-names>R</given-names></string-name>, <string-name><surname>Goldsberry</surname> <given-names>K</given-names></string-name> (<year>2014</year>). <chapter-title>Factorized point process intensities: A spatial analysis of professional basketball</chapter-title>. In: <source><italic>International Conference on Machine Learning</italic></source>, <fpage>235</fpage>–<lpage>243</lpage>. <publisher-name>PMLR</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_020">
<mixed-citation publication-type="journal"> <string-name><surname>Miller</surname> <given-names>JW</given-names></string-name>, <string-name><surname>Harrison</surname> <given-names>MT</given-names></string-name> (<year>2018</year>). <article-title>Mixture models with a prior on the number of components</article-title>. <source><italic>Journal of the American Statistical Association</italic></source>, <volume>113</volume>(<issue>521</issue>): <fpage>340</fpage>–<lpage>356</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Moran</surname> <given-names>PA</given-names></string-name> (<year>1950</year>). <article-title>Notes on continuous stochastic phenomena</article-title>. <source><italic>Biometrika</italic></source>, <volume>37</volume>(<issue>1/2</issue>): <fpage>17</fpage>–<lpage>23</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_022">
<mixed-citation publication-type="journal"> <string-name><surname>Neal</surname> <given-names>RM</given-names></string-name> (<year>2000</year>). <article-title>Markov chain sampling methods for Dirichlet process mixture models</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>9</volume>(<issue>2</issue>): <fpage>249</fpage>–<lpage>265</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Pitman</surname> <given-names>J</given-names></string-name> (<year>1995</year>). <article-title>Exchangeable and partially exchangeable random partitions</article-title>. <source><italic>Probability Theory and Related Fields</italic></source>, <volume>102</volume>(<issue>2</issue>): <fpage>145</fpage>–<lpage>158</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_024">
<mixed-citation publication-type="journal"> <string-name><surname>Reich</surname> <given-names>BJ</given-names></string-name>, <string-name><surname>Hodges</surname> <given-names>JS</given-names></string-name>, <string-name><surname>Carlin</surname> <given-names>BP</given-names></string-name>, <string-name><surname>Reich</surname> <given-names>AM</given-names></string-name> (<year>2006</year>). <article-title>A spatial analysis of basketball shot chart data</article-title>. <source><italic>The American Statistician</italic></source>, <volume>60</volume>(<issue>1</issue>): <fpage>3</fpage>–<lpage>12</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_025">
<mixed-citation publication-type="other"> <string-name><surname>Xu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Bradley</surname> <given-names>JR</given-names></string-name>, <string-name><surname>Sinha</surname> <given-names>D</given-names></string-name> (<year>2019</year>). Latent multivariate log-gamma models for high-dimensional multi-type responses with application to daily fine particulate matter and mortality counts. arXiv preprint: <uri>https://arxiv.org/abs/1909.02528</uri>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_026">
<mixed-citation publication-type="journal"> <string-name><surname>Yang</surname> <given-names>HC</given-names></string-name>, <string-name><surname>Bradley</surname> <given-names>JR</given-names></string-name> (<year>2022</year>). <article-title>Bayesian inference for spatial count data that may be over-dispersed or under-dispersed with application to the 2016 US presidential election</article-title>. <source><italic>Journal of Data Science</italic></source>, <volume>forthcoming</volume>: <fpage>1</fpage>–<lpage>13</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_027">
<mixed-citation publication-type="journal"> <string-name><surname>Yin</surname> <given-names>F</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>W</given-names></string-name> (<year>2022</year>a). <article-title>Analysis of professional basketball field goal attempts via a Bayesian matrix clustering approach</article-title>. <source><italic>Journal of Computational and Graphical Statistics</italic></source>, <volume>forthcoming</volume>: <fpage>1</fpage>–<lpage>23</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1062_ref_028">
<mixed-citation publication-type="chapter"> <string-name><surname>Yin</surname> <given-names>F</given-names></string-name>, <string-name><surname>Jiao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>G</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>J</given-names></string-name> (<year>2022</year>b). <chapter-title>Bayesian nonparametric estimation for point processes with spatial homogeneity: A spatial analysis of NBA shot locations</chapter-title>. In: <source><italic>Proceedings of the 39th International Conference on Machine Learning (ICML)</italic></source>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
