<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn><issn pub-type="ppub">1680-743X</issn><issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1153</article-id>
<article-id pub-id-type="doi">10.6339/24-JDS1153</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Computing in Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>RCloud – Collaborative Visualization and Analysis Platform</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Urbanek</surname><given-names>Simon</given-names></name><email xlink:href="mailto:urbanek@R-project.org">urbanek@R-project.org</email><xref ref-type="aff" rid="j_jds1153_aff_001">1</xref>
</contrib>
<aff id="j_jds1153_aff_001"><label>1</label>Department of Statistics, <institution>University of Auckland</institution>, Auckland, <country>New Zealand</country></aff>
</contrib-group>
<pub-date pub-type="ppub"><year>2025</year></pub-date><pub-date pub-type="epub"><day>12</day><month>12</month><year>2024</year></pub-date><volume>23</volume><issue>2</issue><fpage>389</fpage><lpage>398</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1153_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>
<list>
<list-item id="j_jds1153_li_001">
<label>•</label>
<p>Source code repository and documentation: <uri>https://github.com/att/rcloud</uri></p>
</list-item>
<list-item id="j_jds1153_li_002">
<label>•</label>
<p>Public instance and tutorials: <uri>https://rcloud.social</uri></p>
</list-item>
</list> 
</p>
</caption>
</supplementary-material><history><date date-type="received"><day>31</day><month>7</month><year>2023</year></date><date date-type="accepted"><day>6</day><month>9</month><year>2024</year></date></history>
<permissions><copyright-statement>2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2025</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The last decade has seen a vast increase in the abundance of data, fuelling the need for data analytic tools that can keep up with data size and complexity. This has changed the way we analyze data: moving away from single data analysts working on their individual computers to large clusters and distributed systems leveraged by dozens of data scientists. Technological advances have been addressing the scalability aspects; however, the resulting complexity requires more people to be involved in a data analysis than before. Collaboration and leveraging others’ work become crucial in the modern, interconnected world of data science. In this article we propose and describe RCloud, an open-source, web-based, collaborative visualization and data analysis platform. It decouples the user from the location of the data analysis while preserving security, interactivity and visualization capabilities. Its collaborative features enable data scientists to explore, work together and share analyses in a seamless fashion. We describe the concepts and design decisions that enabled it to support large data science teams in industry and academia.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>cloud computing</kwd>
<kwd>collaboration</kwd>
<kwd>data analysis</kwd>
<kwd>data science</kwd>
<kwd>distributed computing</kwd>
<kwd>reproducible research</kwd>
<kwd>visualization</kwd>
</kwd-group>
<funding-group><funding-statement>Current work and the <ext-link ext-link-type="uri" xlink:href="https://rcloud.social">public instance</ext-link> are supported by the University of Auckland and the Centre for eResearch.</funding-statement></funding-group>
</article-meta>
</front>
<back>
<ref-list id="j_jds1153_reflist_001">
<title>References</title>
<ref id="j_jds1153_ref_001">
<mixed-citation publication-type="book"> <string-name><surname>Chacon</surname> <given-names>S</given-names></string-name>, <string-name><surname>Straub</surname> <given-names>B</given-names></string-name> (<year>2014</year>). <source><italic>Pro Git</italic></source>. <publisher-name>Apress</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_002">
<mixed-citation publication-type="book"> <string-name><surname>Garman</surname> <given-names>J</given-names></string-name> (<year>2003</year>). <source><italic>Kerberos: The Definitive Guide</italic></source>. <publisher-name>O’Reilly Media</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_003">
<mixed-citation publication-type="other"> <string-name><surname>GitHub</surname></string-name> (<year>2020</year>). <uri>https://github.com/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_004">
<mixed-citation publication-type="journal"> <string-name><surname>Gupta</surname> <given-names>M</given-names></string-name>, <string-name><surname>George</surname> <given-names>JF</given-names></string-name> (<year>2016</year>). <article-title>Toward the development of a big data analytics capability</article-title>. <source><italic>Information &amp; Management</italic></source>, <volume>53</volume>(<issue>8</issue>): <fpage>1049</fpage>–<lpage>1064</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_005">
<mixed-citation publication-type="journal"> <string-name><surname>Hilbert</surname> <given-names>M</given-names></string-name>, <string-name><surname>López</surname> <given-names>P</given-names></string-name> (<year>2011</year>). <article-title>The world’s technological capacity to store, communicate, and compute information</article-title>. <source><italic>Science</italic></source>, <volume>332</volume>: <fpage>60</fpage>–<lpage>65</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_006">
<mixed-citation publication-type="other"> Jupyter Development Team (<year>2015</year>). Messaging in Jupyter.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_007">
<mixed-citation publication-type="chapter"> <string-name><surname>Kluyver</surname> <given-names>T</given-names></string-name>, <string-name><surname>Ragan-Kelley</surname> <given-names>B</given-names></string-name>, <string-name><surname>Pérez</surname> <given-names>F</given-names></string-name>, <string-name><surname>Granger</surname> <given-names>B</given-names></string-name>, <string-name><surname>Bussonnier</surname> <given-names>M</given-names></string-name>, <string-name><surname>Frederic</surname> <given-names>J</given-names></string-name>, <etal>et al.</etal> (<year>2016</year>). <chapter-title>Jupyter notebooks – a publishing format for reproducible computational workflows</chapter-title>. In: <source><italic>Positioning and Power in Academic Publishing: Players, Agents and Agendas</italic></source> (<string-name><given-names>F</given-names> <surname>Loizides</surname></string-name>, <string-name><given-names>B</given-names> <surname>Schmidt</surname></string-name>, eds.), <fpage>87</fpage>–<lpage>90</lpage>. <publisher-name>IOS Press</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_008">
<mixed-citation publication-type="journal"> <string-name><surname>Knuth</surname> <given-names>DE</given-names></string-name> (<year>1984</year>). <article-title>Literate programming</article-title>. <source><italic>The Computer Journal</italic></source>, <volume>27</volume>: <fpage>97</fpage>–<lpage>111</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_009">
<mixed-citation publication-type="other"> <string-name><surname>Miller</surname> <given-names>MS</given-names></string-name> (<year>2006</year>). Robust composition: Towards a unified approach to access control and concurrency control, Ph.D. thesis, Johns Hopkins University, Baltimore, Maryland, USA.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_010">
<mixed-citation publication-type="chapter"> <string-name><surname>North</surname> <given-names>S</given-names></string-name>, <string-name><surname>Scheidegger</surname> <given-names>C</given-names></string-name>, <string-name><surname>Urbanek</surname> <given-names>S</given-names></string-name>, <string-name><surname>Woodhull</surname> <given-names>G</given-names></string-name> (<year>2015</year>). <chapter-title>Collaborative visual analysis with RCloud</chapter-title>. In: <source><italic>2015 IEEE Conference on Visual Analytics Science and Technology (VAST)</italic></source>, <fpage>25</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_011">
<mixed-citation publication-type="book"> <string-name><surname>Odersky</surname> <given-names>M</given-names></string-name>, <string-name><surname>Spoon</surname> <given-names>L</given-names></string-name>, <string-name><surname>Venners</surname> <given-names>B</given-names></string-name> (<year>2008</year>). <source><italic>Programming in Scala</italic></source>. <publisher-name>Artima</publisher-name>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Pérez</surname> <given-names>F</given-names></string-name>, <string-name><surname>Granger</surname> <given-names>BE</given-names></string-name> (<year>2007</year>). <article-title>IPython: a system for interactive scientific computing</article-title>. <source><italic>Computing in Science &amp; Engineering</italic></source>, <volume>9</volume>(<issue>3</issue>): <fpage>21</fpage>–<lpage>29</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_013">
<mixed-citation publication-type="book"> <collab>R Core Team</collab> (<year>2022</year>). <source><italic>R: A Language and Environment for Statistical Computing</italic></source>. <publisher-name>R Foundation for Statistical Computing</publisher-name>, <publisher-loc>Vienna, Austria</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_014">
<mixed-citation publication-type="other"> Redis (<year>2020</year>). <uri>https://redis.io/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_015">
<mixed-citation publication-type="book"> <string-name>RStudio Team</string-name> (<year>2020</year>). <source><italic>RStudio: Integrated Development Environment for R</italic></source>. <publisher-name>RStudio, PBC</publisher-name>, <publisher-loc>Boston, MA</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_016">
<mixed-citation publication-type="journal"> <string-name><surname>Sandve</surname> <given-names>GK</given-names></string-name>, <string-name><surname>Nekrutenko</surname> <given-names>A</given-names></string-name>, <string-name><surname>Taylor</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hovig</surname> <given-names>E</given-names></string-name> (<year>2013</year>). <article-title>Ten simple rules for reproducible computational research</article-title>. <source><italic>PLoS Computational Biology</italic></source>, <volume>9</volume>(<issue>10</issue>): <elocation-id>e1003285</elocation-id>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_017">
<mixed-citation publication-type="other"> <string-name><surname>Stagg</surname> <given-names>GW</given-names></string-name>, <string-name><surname>Henry</surname> <given-names>L</given-names></string-name> (<year>2024</year>). webr: The statistical language R compiled to WebAssembly via Emscripten.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_018">
<mixed-citation publication-type="other"> The Apache Software Foundation (2020a). Apache Lucene. <uri>https://lucene.apache.org/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_019">
<mixed-citation publication-type="other"> The Apache Software Foundation (2020b). Apache Solr. <uri>https://solr.apache.org/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_020">
<mixed-citation publication-type="other"> <string-name><surname>Tuloup</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tandon</surname> <given-names>M</given-names></string-name>, <string-name><surname>Renou</surname> <given-names>M</given-names></string-name>, <string-name><surname>Beier</surname> <given-names>T</given-names></string-name> (<year>2021</year>). Jupyterlite: Wasm powered Jupyter running in the browser.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_021">
<mixed-citation publication-type="chapter"> <string-name><surname>Urbanek</surname> <given-names>S</given-names></string-name> (<year>2003</year>). <chapter-title>Rserve – a fast way to provide R functionality to applications</chapter-title>. In: <source><italic>Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003)</italic></source>.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_022">
<mixed-citation publication-type="other"> <string-name><surname>Vaidyanathan</surname> <given-names>R</given-names></string-name>, <string-name><surname>Xie</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Allaire</surname> <given-names>J</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>J</given-names></string-name>, <string-name><surname>Sievert</surname> <given-names>C</given-names></string-name>, <string-name><surname>Russell</surname> <given-names>K</given-names></string-name> (<year>2021</year>). <italic>htmlwidgets: HTML Widgets for R</italic>. R package version 1.5.4.</mixed-citation>
</ref>
<ref id="j_jds1153_ref_023">
<mixed-citation publication-type="book"> <string-name><surname>Van Rossum</surname> <given-names>G</given-names></string-name>, <string-name><surname>Drake</surname> <given-names>FL</given-names></string-name> (<year>2009</year>). <source><italic>Python 3 Reference Manual</italic></source>. <publisher-name>CreateSpace</publisher-name>, <publisher-loc>Scotts Valley, CA</publisher-loc>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
