Introduction to the Command Line for Genomics: Instructor Notes

Lesson motivation and learning objectives

Many researchers making the transition into genomics research (whether from another field or as their first research project) have no prior experience with command-line tools. They may quickly run into situations in which they need to use command-line tools, either because there is no good alternative for the type of analysis they want to do or because they have so many data files that processing each file manually is infeasible.

This lesson will introduce learners to the fundamental skills needed for working with their computers through a command-line interface (using the bash shell). They will learn how to navigate their file system, manipulate their files from the command line (e.g. copying, moving, renaming), search files, redirect output, and write shell scripts. By the end of the lesson, learners will be prepared to move on to using more advanced bioinformatic command-line tools (see the lesson on Data Wrangling and Processing).
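For instructors who want a quick preview of the skills listed above, the following is a minimal sketch of a practice session in bash. It is not taken from the lesson materials: every file name, directory name, and file content here is hypothetical and chosen only for illustration, and the session runs in a temporary scratch directory so that no real data is touched.

```shell
# Hypothetical practice session; all names and contents are made up for illustration.
workdir=$(mktemp -d)    # scratch directory, so no real files are affected
cd "$workdir"

# Navigate the file system: make a directory, move into it, show where we are
mkdir reads
cd reads
pwd

# Create two tiny FASTQ-like files (illustrative content only)
printf '@read1\nACGTNNNN\n+\n!!!!!!!!\n' > sample_A.fastq
printf '@read2\nACGTACGT\n+\nIIIIIIII\n' > sample_B.fastq

# Copy a file, then rename another with mv
cp sample_A.fastq sample_A_backup.fastq
mv sample_B.fastq sample_B_renamed.fastq
ls

# Search the files for a run of Ns and redirect the matching lines to a new file
grep -h NNNN *.fastq > bad_reads.txt
cat bad_reads.txt

# Write a small shell script that counts header lines, then run it
printf '#!/bin/bash\ngrep -c "^@" "$1"\n' > count_reads.sh
bash count_reads.sh sample_A.fastq
```

Counting lines that start with `@` only approximates the number of reads, because FASTQ quality lines can also begin with `@`. It works here because the toy quality strings do not.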

Lesson design

This lesson is meant to be taught in its entirety. For novice learners, schedule around 4 hours for this lesson. If your learners are already somewhat familiar with the bash shell, the earlier episodes can be condensed.

This lesson uses data hosted on an Amazon Machine Image (AMI). Instructors will be sent information on how to log in to the AMI by the workshop coordinator a few days before the workshop. If you are running a self-organized workshop, register the workshop with our self-organized workshop form and send us an email at team@datacarpentry.org with information on how many people you expect to have at the workshop, and we’ll start instances for you to use in the workshop. The day before the workshop, we’ll send you the login information for your learners.

Technical tips and tricks

Common problems

Learners will work through an Amazon Web Services (AWS) instance for this lesson. The workshop coordinator will set up AWS instances for your workshop a few days ahead of time. Put the links for all instances on your workshop Etherpad and have learners put their name next to the instance they will use. This prevents learners from accidentally messing up another learner’s filesystem.

The workshop coordinator usually sets up more AWS instances than needed for the registered learners. If a learner accidentally deletes or overwrites data files, you can have them change to a different AWS instance.