8 November 2023, 15:00
Reproducible research is necessary to ensure that scientific work can be
trusted. Funders and publishers are beginning to require that publications
include access to the underlying data and the analysis code. The goal is to
ensure that all results can be independently verified and built upon in future
work. This is sometimes easier said than done! Sharing these research
outputs means understanding data management, library sciences, software
development, and continuous integration techniques: skills that are not widely
taught or expected of academic researchers. A particularly steep barrier to
working with codebases is setting up computational environments. A mismatched
combination of package versions can undermine the reproducibility of code,
with effects ranging from outright failures to subtle changes in generated
outputs. There are many tools available to manage your computational
environment, but in this talk we’ll explore Project Binder and its subproject
repo2docker, which aims to automate reproducibility best
practices across a number of ecosystems. Binder can build portable
computational environments on demand, with all the necessary information
encoded in a single, clickable URL, which greases the wheels of collaborative
research while reducing the toil involved. We will discuss how these concepts can
apply to the HPC community.
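
To make the "single, clickable URL" idea concrete, here is a minimal sketch of how such a link could be assembled for a repository hosted on GitHub. It assumes the public mybinder.org deployment and its /v2/gh/<org>/<repo>/<ref> path layout; the helper name binder_url and its parameters are hypothetical and are not part of Binder or repo2docker, and the repository in the usage note is only an illustration.

```python
from urllib.parse import quote, urlencode

# Assumption: the public mybinder.org service; other BinderHub deployments use a different host.
BINDER_HOST = "https://mybinder.org"


def binder_url(org, repo, ref="HEAD", urlpath=None, host=BINDER_HOST):
    """Build a Binder launch link for a GitHub repository (hypothetical helper).

    org, repo -- GitHub owner and repository name
    ref       -- branch, tag, or commit to build the environment from
    urlpath   -- optional page to open once the session starts
    """
    url = f"{host}/v2/gh/{quote(org)}/{quote(repo)}/{quote(ref)}"
    if urlpath:
        # The optional start page travels as a query parameter.
        url += "?" + urlencode({"urlpath": urlpath})
    return url


if __name__ == "__main__":
    print(binder_url("binder-examples", "requirements"))
    print(binder_url("binder-examples", "requirements", urlpath="lab/tree/index.ipynb"))
```

Pinning ref to a tag or commit rather than a moving branch means the link keeps pointing at the same environment specification over time, which is the property a citation in a paper would likely want.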
Videos brought to you by the video team at Institut Agro and published under CC-BY-NC 3.0. Guix artwork by Luis Felipe published under CC-BY-SA 4.0.