Bioinformatics workflows with Snakemake

Contributors

Tim Booth, Nathan Medd, Hywel Dunn-Davies, Katie Emelianova, Frances Turner.

Overview

Overall goals

To understand the core principles of Snakemake and what it’s useful for
To learn the syntax and features of the Snakemake language
To design, implement, optimise and debug Snakemake workflows
To be confident applying Snakemake to real-world analysis tasks

This lesson is designed to be covered over two days plus an extra session for a longer challenge.

Content

Running commands with Snakemake
- How do I run a simple command with Snakemake?
Placeholders and wildcards
- How do I make a generic rule?
Chaining rules
- How do I combine rules into a workflow?
- How do I make a rule with multiple inputs and outputs?
Partial and remedial runs
- How do I visualise a Snakemake workflow?
- How does Snakemake avoid unnecessary work?
- How do I control what steps will be run?
Processing lists of inputs
- How do I process multiple files at once?
- How do I add splitting and combining steps?
- How do I make Snakemake auto-detect inputs?
Quoting and error checking
- How do I make my Snakefiles robust?
Configuring workflows
- How do I separate my rules from my configuration?
- How do I make my workflows re-usable?
Speeding up workflows
- How does Snakemake handle parallel execution and threads?
- How can I make my workflow as fast as possible?
Conda integration
- How do I use conda packages with Snakemake?

Longer challenge

We ask the participants to re-implement an existing workflow, provided as a shell script, as a Snakemake workflow.
This is intended as an extra day after the two-day course, and uses an example based on ChIP-Seq analysis.
