Chapter 5 Computational skills
The data you are working with is too large to analyze by hand. For example, the human genome is about 3.2 billion bases long. Comparing that much data with data from other species by eye would be impractical. Instead, we need to learn some basic programming skills so that we can tell the computer how to work with our data.
First, the computer we’re using doesn’t have the graphical interface you are used to on a PC or Mac. That means you have to type your commands; you can’t point and click. This has a hidden benefit: it is easy to record what you did as a line of text, rather than trying to explain where to click at each step.
Let’s start by learning how to log in to our server and run commands without clicking.
5.1 Getting access
- Follow the directions for students to obtain an account on URI’s High Performance Computing Cluster (Unity).
- You will need to request an account under pi_rsschwartz_uri_edu
- Wait for your account to be approved
5.2 Log in to the server
- On the main Unity page select OpenOnDemand
- Select URI to log in
- At the top of the page select Shell - Unity Shell access
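Once the shell opens, you can try a few commands to get your bearings. The lines below are a minimal sketch rather than part of Unity’s own instructions. They use only standard shell commands, and what they print will depend on your account.

```shell
# Print the directory you are currently in
pwd

# List the files and folders in the current directory
ls

# Print the location of your home directory
echo "$HOME"
```

Type each command and press Enter to run it. Lines starting with # are comments, which the shell ignores, so they are a handy way to leave notes about what each command does.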
5.4 Data storage
Unlike your personal computer, Unity has multiple storage options. You will be working with extremely large datasets, and for long-term storage these should be kept in the most cost-effective way possible. For more immediate work, however, you need storage that can be read and written quickly. Faster storage is more expensive, so you will have less space on it. For this reason we may work on data in one place (faster, but with less space) and then move our results to another location (more space, but slower).
When you log in you should see something like the following information:
/home/rsschwartz_uri_edu: 146M (1%) of 50G
/work/pi_rsschwartz_uri_edu: 19G (2%) of 1000G
/project/pi_rsschwartz_uri_edu: 0 (0%) of 5.0T
Read Unity’s storage guide for additional information.
Following these guidelines, you should use /work for running your analyses and /project for long-term storage.
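Moving finished results from /work to /project might look something like the sketch below. The helper name archive_results and the folder names my_analysis and results are made up for illustration, and your own layout may differ. The helper copies a folder to its destination and deletes the original only if the copy succeeded.

```shell
# Hypothetical helper (not a Unity command): copy a finished results folder
# to long-term storage, then remove the original only if the copy worked.
archive_results() {
    src="$1"    # results folder in fast storage, e.g. somewhere under /work
    dest="$2"   # destination folder in long-term storage, e.g. under /project
    mkdir -p "$dest" && cp -r "$src" "$dest"/ && rm -r "$src"
}

# Show how much space is used and available on the storage you are in now
df -h .

# Example call, commented out because the folder names are placeholders:
# archive_results /work/pi_rsschwartz_uri_edu/my_analysis/results /project/pi_rsschwartz_uri_edu/my_analysis
```

Because the steps are chained with &&, the original folder is deleted only after the copy finishes without an error. If you would rather keep both copies until you have checked the results yourself, leave off the final rm step.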