From analysis to communication: tools for data-driven science with integrity
Interactive data analysis has been made possible by coding ecosystems, such as the tidyverse in R and pandas in Python, that support an iterative dialogue between data and models. The analysis process leaves behind a diverse trail of pathways explored, decisions made, failed attempts and judgements that a satisfactory outcome has been reached, that is, decisions about when to stop an exploration. How can these analysis trails be documented? How can the provenance of these processes be recorded so that it serves as evidence and, eventually, as a communicative artefact?
This theme will focus on tools and technologies through which exploratory data analysis workflows can be documented, communicated and shared. We will explore the space of computational notebook environments and how they can be used to record and communicate interactive data analysis processes. Drawing on work from themes #1 and #2, we will identify examples of interactive data analyses, documented via computational notebooks, that balance complexity of information with claims to knowledge.