Pub. online: 13 Mar 2024 | Type: Statistical Data Science | Open Access
Journal:Journal of Data Science
Volume 22, Issue 2 (2024): Special Issue: 2023 Symposium on Data Science and Statistics (SDSS): “Inquire, Investigate, Implement, Innovate”, pp. 280–297
Abstract
The use of visuals is a key component in scientific communication. Decisions about the design of a data visualization should be informed by which design elements best support the audience’s ability to perceive and understand its components. We build on the foundations of Cleveland and McGill’s work in graphical perception, employing a large, nationally representative, probability-based panel of survey respondents to test perception in stacked bar charts. Our findings provide actionable guidance for data visualization practitioners to employ in their work.
Pub. online: 21 Dec 2022 | Type: Data Science In Action | Open Access
Journal:Journal of Data Science
Volume 21, Issue 2 (2023): Special Issue: Symposium Data Science and Statistics 2022, pp. 239–254
Abstract
The 2020 Census County Assessment Tool was developed to assist decennial census data users in identifying deviations between expected census counts and the released counts across population and housing indicators. The tool also offers contextual data for each county on factors that could have contributed to census collection issues, such as self-response rates and COVID-19 infection rates. The tool compiles this information into a downloadable report, which also lists additional local data sources relevant to the data collection process and experts to contact for further assistance.
Social determinants of health (SDOH) are the conditions in which people are born, grow, work, and live. Although evidence suggests that SDOH influence a range of health outcomes, health systems lack the infrastructure to access and act upon this information. The purpose of this manuscript is to explain the methodology that a health system used to: 1) identify and integrate publicly available SDOH data into the health system’s Data Warehouse, 2) integrate HIPAA-compliant geocoding software (via DeGAUSS), and 3) visualize data to inform SDOH projects (via Tableau). First, the authors engaged key stakeholders across the health system to convey the implications of SDOH data for our patient population and identify variables of interest. As a result, fourteen publicly available data sets, accounting for more than 30,800 variables representing national, state, county, and census tract information over 2016–2019, were cleaned and integrated into our Data Warehouse. To pilot the data visualization, we created county and census tract level maps for our service areas and plotted common SDOH metrics (e.g., income, education, and insurance status). This practical, methodological integration of SDOH data at a large health system demonstrated feasibility. Ultimately, we will repeat this process system-wide to further understand the risk burden in our patient population and improve our prediction models, allowing us to become better partners with our community.
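The integration step described in this abstract, linking geocoded patient records to tract-level SDOH variables, can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`record_id`, `tract_geoid`, `median_income`, `pct_uninsured`), the tract identifiers, and all values are hypothetical, and the geocoder output is assumed to have already been reduced to one census tract identifier per record.

```python
from collections import defaultdict

# Hypothetical tract-level SDOH rows keyed by an 11-digit census tract ID.
# All identifiers and values here are made up for illustration.
sdoh_by_tract = {
    "39061000200": {"median_income": 28500, "pct_uninsured": 14.2},
    "39061000700": {"median_income": 61200, "pct_uninsured": 5.1},
}

# Hypothetical geocoder output: a de-identified record ID plus a tract ID.
geocoded_patients = [
    {"record_id": "a1", "tract_geoid": "39061000200"},
    {"record_id": "b2", "tract_geoid": "39061000700"},
    {"record_id": "c3", "tract_geoid": "39061000200"},
    {"record_id": "d4", "tract_geoid": None},  # address could not be geocoded
]


def attach_sdoh(patients, sdoh):
    """Left-join SDOH attributes onto patient records by tract ID.

    Records with a missing or unknown tract are kept, without SDOH fields.
    """
    joined = []
    for patient in patients:
        attrs = sdoh.get(patient["tract_geoid"], {})
        joined.append({**patient, **attrs})
    return joined


def patients_per_tract(patients):
    """Count geocoded records per tract, e.g. as input to a tract-level map."""
    counts = defaultdict(int)
    for patient in patients:
        if patient["tract_geoid"] is not None:
            counts[patient["tract_geoid"]] += 1
    return dict(counts)


joined = attach_sdoh(geocoded_patients, sdoh_by_tract)
counts = patients_per_tract(geocoded_patients)
```

Keeping ungeocoded records in the joined output (a left join) rather than dropping them is a design choice in this sketch; it makes the share of unmatched addresses visible before any mapping or modeling.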