Bioinformatics workflows with Snakemake


Contributors

Tim Booth, Nathan Medd, Hywel Dunn-Davies, Katie Emelianova, Frances Turner.

Overview

Overall goals

  • To understand the core principles of Snakemake and what it’s useful for
  • To learn the syntax and features of the Snakemake language
  • To design, implement, optimise and debug Snakemake workflows
  • To be confident applying Snakemake to real-world analysis tasks

This lesson is designed to be covered over two days plus an extra session for a longer challenge.

Content

  • Running commands with Snakemake
    • How do I run a simple command with Snakemake?
  • Placeholders and Wildcards
    • How do I make a generic rule?
  • Chaining rules
    • How do I combine rules into a workflow?
    • How do I make a rule with multiple inputs and outputs?
  • Partial and remedial runs
    • How do I visualise a Snakemake workflow?
    • How does Snakemake avoid unnecessary work?
    • How do I control what steps will be run?
  • Processing lists of inputs
    • How do I process multiple files at once?
    • How do I add splitting and combining steps?
    • How do I make Snakemake auto-detect inputs?
  • Quoting and error checking
    • How do I make my Snakefiles robust?
  • Configuring workflows
    • How do I separate my rules from my configuration?
    • How do I make my workflows re-usable?
  • Speeding up workflows
    • How does Snakemake handle parallel execution and threads?
    • How can I make my workflow as fast as possible?
  • Conda integration
    • How do I use conda packages in conjunction with Snakemake?

Longer challenge

We ask the participants to re-implement an existing workflow, provided as a shell script, as a Snakemake workflow.
This is intended as an extra day after the two-day course, with an example based on ChIP-Seq analysis.
