Tools for outbreak analytics infrastructures


Beyond the availability of data and methods, and the use of good practices for reproducible science, the outbreak response context poses a number of practical challenges for data analysis. In this lecture, we introduce tools which can help address some of these challenges, and create robust, efficient, and more easily deployable data analytics pipelines using R.


The slides for this lecture are available on Google Slides.

Related packages


linelist provides data cleaning tools which address most of the common problems in epidemiological data. While tailored for case data (hence the name linelist), it is very general and will likely be useful in many other contexts. Its main features include:

  • data standardisation: enforces consistent capitalisation and separators, and replaces non-ASCII characters with their closest ASCII match

  • date guessing: automatically detects dates, identifies their formats, and performs the required conversions

  • dictionary-based cleaning: applies cleaning rules to fix typos and recode variables according to a user-defined dictionary
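
The three features above can be sketched in base R. This illustrates the ideas only: `standardise_text()`, `guess_date()`, and `recode_with_dictionary()` are hypothetical helpers written for this example, not functions from linelist, and the list of date formats tried is an assumption. Replacement of non-ASCII characters is left out because transliteration behaves differently across platforms.

```r
# Hypothetical base R helpers; NOT the linelist API.

# 1. Standardisation: lower case, and a single "_" as the only separator
standardise_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)  # any run of other characters becomes "_"
  gsub("^_|_$", "", x)             # drop leading and trailing separators
}

# 2. Date guessing: try each candidate format in turn (formats are assumed)
guess_date <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)             # entries not parsed by an earlier format
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out                              # entries matching no format stay NA
}

# 3. Dictionary-based cleaning: replace values listed in a user-defined table
recode_with_dictionary <- function(x, dictionary) {
  hit <- match(x, dictionary$from)
  ifelse(is.na(hit), x, dictionary$to[hit])
}

dictionary <- data.frame(
  from = c("m", "f"),
  to   = c("male", "female"),
  stringsAsFactors = FALSE
)

standardise_text(c("  Date of.Onset ", "Sex"))
# "date_of_onset" "sex"
guess_date(c("2019-03-01", "01/03/2019", "unknown"))
# "2019-03-01" "2019-03-01" NA
recode_with_dictionary(c("male", "m", "f", "unknown"), dictionary)
# "male" "male" "female" "unknown"
```

Trying formats in a fixed order means an ambiguous entry such as 01/03/2019 is resolved by whichever matching format comes first, so the order should reflect the conventions of the data source.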

For more information on this package, go to: https://repidemicsconsortium/linelist.

To install this package, type:
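
One way to do this, assuming the package is installed from GitHub and that the repository is `reconhub/linelist` (the repository name is an assumption; check the package page above before running):

```r
# Assumption: the package source lives at github.com/reconhub/linelist
# install.packages("remotes")  # uncomment if remotes is not installed yet
remotes::install_github("reconhub/linelist")
```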



reportfactory provides an infrastructure for managing multiple Rmd reports that need regular updating.

For more information on this package, go to:

To install this package, type:
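
One way to do this, assuming the package is installed from GitHub and that the repository is `reconhub/reportfactory` (the repository name is an assumption and should be checked before running):

```r
# Assumption: the package source lives at github.com/reconhub/reportfactory
# install.packages("remotes")  # uncomment if remotes is not installed yet
remotes::install_github("reconhub/reportfactory")
```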



rmarkdown extends the capabilities of knitr with a more diverse set of outputs generated from Rmd files, including PDF documents, article templates, PDF or HTML slides, and even web applications.

More information on rmarkdown is available from:

To install this package, type:
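
rmarkdown is distributed on CRAN, so the standard installer applies. The second call is a short usage sketch; the file name `report.Rmd` is hypothetical and stands for any Rmd file in the working directory.

```r
# Install the released version from CRAN
install.packages("rmarkdown")

# Compile a report to HTML ("report.Rmd" is a placeholder file name)
rmarkdown::render("report.Rmd", output_format = "html_document")
```

Changing `output_format` (for example to `"pdf_document"`, which needs a LaTeX installation) produces a different output type from the same source file.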


Other resources

Golden rules for writing analysis reports

These golden rules list several coding and statistical practices aimed at improving readability and robustness of analysis reports. Click on this link to download the current version, or visit this page for more information.

Report factory templates

This repository provides templates of report factories based on existing factories. Visit the GitHub project for more information.

R4epi templates

The R4epi project provides several templates for epidemiological data analysis. Visit their website for more information.

About this document


  • Thibaut Jombart: initial version

Contributions are welcome via pull requests. The source files include:

License: CC-BY

Copyright: Thibaut Jombart, 2017