RADISH23
Reproducibility, Accessibility, Documentation and Inter-operability Standards Hackathon 2023
Background
The Reproducibility, Accessibility, Documentation and Inter-operability Standards Hackathon (RADISH23), organised by Bob Verity, Shazia Ruybal-Pesántez, Bryan Greenhouse and Amy Wesolowski, took place at Johns Hopkins Bloomberg School of Public Health in Baltimore, USA, from 11 to 14 December 2023, with 16 participants from 11 institutions.
For more details on RADISH23 attendees and contributors to PGEforge, see the Contributors page.
Main aims
Our aim was to take the wide range of software tools in malaria genomic epidemiology and create community resources so that more people can use them more reliably. Over the course of 4 days, we began creating a systematic framework for analysis by curating existing software tools, identifying the gaps in commonly used tools, and holding broader discussions about standardizing software practices.
One of our main aims was to create community resources that allow anyone with basic computer skills to run common Plasmodium genetic analyses locally. By framing this work within the wider context of use-cases, we made progress towards harmonizing which tools need to be chained together, as part of well-defined and flexible workflows, to answer specific questions relevant to malaria control.
Outputs
The event was primarily coding-based and hands-on. The aim was that the materials developed for each tool would allow an end-user to go from installing the tool on their local machine to running an analysis on the standardized datasets we compiled.
Prior to the hackathon, we carried out a scoping/landscaping exercise to identify all available analysis tools and evaluated them against a set of software standards, identifying those that are superseded or relegated and those we would prioritize during the hackathon. For those priority tools, we developed comprehensive guides, including summary documents, installation aids and tutorials.
In addition, we compiled simulated and empirical genomic datasets in the common formats that the various tools require as input, so that end-users can reproduce the tutorials and reuse the data in future work.
Alongside the coding tasks, there were several small break-out sessions in which participants defined use cases for malaria genomic surveillance data analysis and sketched a workflow for each, setting out the functionality required and mapping it to available tools.