Note: This is an update to my previous post: How to Run Snakemake pipeline on HPC.
In my previous post, I disucessed some tips on how to effectively manage workflow using Snakemake on an HPC system. However, I have recently noticed that Snakemake support for --cluster-config is offcially deprecated in favor of --profile. I spent most of today digging into this feature and now I’m happy to share with you my latest setup.
For years I have been sticking with R and RStudio, primarily because RStudio’s useful and user-friendly GUI design (and my own comfort with Tidyverse). Indeed, my biggest complain against Jupyter Notebook whenever somebody introduced it to me was its bare-bone functionality. Until I discovered Jupyter Lab!
I have been experimenting with Jupyter Lab and migrating some of my work there. And I’m happy to report that I’m fully ready to jump the ship and join team Jupyter at this point!
Why use Snakemake on HPC Snakemake is a handy workflow manager written in Python. It handles workflow based on predefined job dependencies. One of the great features of Snakemake is that it can manage a workflow on both a standalone computer, or a clustered HPC system. HPC, or “cluster” as it’s often referred to, requires additional considerations.
On HPC, all computing jobs should be submitted to “compute nodes” through a workload manager (for example, Slurm).