RCloud – Collaborative Visualization and Analysis Platform
Pub. online: 12 December 2024
Type: Computing In Data Science
Open Access
Received: 31 July 2023
Accepted: 6 September 2024
Published: 12 December 2024
Abstract
The last decade has seen a vast increase in the abundance of data, fuelling the need for data analytic tools that can keep up with the data size and complexity. This has changed the way we analyze data: moving away from single data analysts working on their individual computers to large clusters and distributed systems leveraged by dozens of data scientists. Technological advances have been addressing the scalability aspects; however, the resulting complexity necessitates that more people are involved in a data analysis than before. Collaboration and leveraging of others’ work become crucial in the modern, interconnected world of data science. In this article we propose and describe RCloud, an open-source, web-based, collaborative visualization and data analysis platform. It decouples the user from the location of the data analysis while preserving security, interactivity and visualization capabilities. Its collaborative features enable data scientists to explore, work together and share analyses in a seamless fashion. We describe the concepts and design decisions that enabled it to support large data science teams in industry and academia.
Supplementary material
• Source code repository and documentation: https://github.com/att/rcloud
• Public instance and tutorials: https://rcloud.social