Minerva Quick Start

The Minerva HPC complex consists of three major partitions:

  • The manda partition which is composed of 120 AMD nodes each with 64 AMD Interlagos (2.3 GHz) cores and 256GB of memory
  • The mothra partition which is composed of 209 IBM nodes each with 12 Intel Ivy Bridge (3.5 GHz) cores and 64GB of memory
  • The bode partition which is composed of 207 Cray nodes each with 12 Intel Haswell cores (2.4 GHz) and 64GB of memory
  • Note that only BODE-enabled users have access to the bode partition.

There are also:

  • 8 GPGPU nodes each with 2 x NVIDIA Tesla K20Xm, 24 Intel cores and 256GB of memory
  • 1 GPGPU node with 4 x NVIDIA P100, 20 Intel Broadwell cores, 128 GB memory
  • 1 High-Memory node with 1.7 TB memory and 16 Intel Broadwell cores @ 3.2GHz

Connecting to Minerva

For security, Minerva uses the Secure Shell (ssh) protocol and Two Factor authentication.

Unix systems typically have an ssh client already installed. Windows systems can download
one of several ssh clients that are available for free such as PuTTY.
Two Factor authentication requires you to enter a password that is the combination of your Sinai password and a generated token.
Tokens can be obtained from anyone on the Scientific Computing staff or from the Mount Sinai Help Desk.

From on-site

If you are already on the Sinai campus network, you can login via ssh to

  • manda partition via ssh to manda.hpc.mssm.edu
  • mothra partition via ssh to mothra.hpc.mssm.edu
  • bode partition via ssh to bode.hpc.mssm.edu

For example, to log onto the manda login node, the syntax would be (the > sign indicates what you would type in):

> ssh your_userid@manda.hpc.mssm.edu
Password: > your_Sinai_password123456


123456 represents the numeric sequence obtained from your token.
From off-site

If you are off the Sinai campus network, you can only log into the manda partition. For this, the address is
minerva.hpc.mssm.edu. When you log in from outside, you will not be able to reach the Sinai campus network and, therefore,
some services will not be accessible.

> ssh your_userid@minerva.hpc.mssm.edu
Password: > your_Sinai_password123456

External groups and visiting faculty or students who have a yubikey instead of a token should modify the ssh command and password as follows:

> ssh your_userid+yldap@minerva.hpc.mssm.edu
Password: > your_Sinai_passwordYUBIKEY

YUBIKEY represents pushing the button on your yubikey while inserted into a USB port on your computer.
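
If you log in often, you can add a host alias to the ssh client configuration on your own machine so that the full hostname and userid need not be retyped each time. A sketch of an OpenSSH ~/.ssh/config entry (the alias name and userid are placeholders; yubikey users would set User to your_userid+yldap):

```
Host minerva
    HostName minerva.hpc.mssm.edu
    User your_userid
```

With this entry in place, ssh minerva is equivalent to the full command above; you will still be prompted for your combined password and token.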

File System

All file systems are mounted on all nodes regardless of the underlying architecture.

/hpc/users/<userid> User HOME directories. 10GB quota. NOT purged and IS backed up. Generally used for the "rc" and configuration files for various programs. It is slow.
/sc/orga/work/<userid> A work directory for each user. 100GB quota. NOT purged but NOT backed up. To be used for whatever purpose the user desires.
/sc/orga/scratch/<userid> A folder for each user inside /sc/orga/scratch, which has a 300TB quota shared by all users. Use this in lieu of /tmp for temporary files and for short-term storage of up to 14 days; files older than 14 days are purged automatically by the system.
/sc/orga/projects/<projectid> A directory for each approved project. The quota is set to the approved allocation for the project. NOT purged but NOT backed up.
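
Since scratch rather than /tmp should hold temporary job files, a typical pattern is to create a unique per-job directory there and remove it when done. A minimal sketch (on Minerva you would set SCRATCH_BASE to /sc/orga/scratch/$USER; the /tmp fallback here is only so the snippet runs anywhere):

```shell
# Sketch: stage temporary files under scratch instead of /tmp.
# SCRATCH_BASE is an assumption for illustration -- on Minerva, set it to
# /sc/orga/scratch/$USER; elsewhere it falls back to /tmp.
SCRATCH_BASE="${SCRATCH_BASE:-/tmp}"
workdir=$(mktemp -d "$SCRATCH_BASE/job.XXXXXX")   # unique per-job directory
echo "intermediate results" > "$workdir/partial.out"
# ... job reads and writes under $workdir ...
rm -rf "$workdir"   # clean up promptly; scratch purges files older than 14 days
```

Cleaning up explicitly is good practice even though the 14-day purge would remove leftovers eventually.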

Queues

The queues that are available are:

Queue         Description                                        Default Wall Time   Maximum Wall Time
alloc         Normal priority jobs for allocated users           5h                  144h (6d)
expressalloc  High priority, short jobs for allocated users      1h                  2h
low           Low priority jobs; no charge for this queue        5h                  24h
premium       High priority jobs from allocated users            5h                  144h
private       Groups that have purchased private nodes           5h                  owner determined

There are several other queues not listed which are for system testing only.

LSF

Minerva uses LSF for batch submission; bsub is the submission command. Options can be given on the command line or in the submission script. HOWEVER, if the options are placed in the submission script, you must feed the script into the bsub command via stdin for the options to be read, e.g.:

cat MyLSF.script | bsub
or
bsub < MyLSF.script

Some important points of interest:

  • The default disposition for output and logs is for LSF to email the output to you. This piece is not working yet so you must use the "-o" option to save the output.
  • In general, the shortest quantum of time in LSF is 1 minute. Wall time is expressed as HHH:MM -- There are no seconds. Durations are generally in minutes.
  • System level checkpoints are supported by LSF. There are some "gotchas" ( E.g., the default method does not work on our system) so check with the SC staff if you need/want to do checkpointing.
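
Putting these points together, a minimal submission script might look like the following (the job name, queue, core count, and wall time are illustrative placeholders, not recommendations; adjust them to your allocation):

```shell
#!/bin/bash
# Hypothetical LSF submission script -- all values below are placeholders.
#BSUB -J myjob             # job name
#BSUB -q alloc             # queue (see the table above)
#BSUB -n 12                # number of cores
#BSUB -W 02:00             # wall time as HH:MM -- no seconds
#BSUB -o myjob.%J.out      # save stdout to a file (email delivery is not working)
#BSUB -e myjob.%J.err      # save stderr to a file

echo "Running on $(hostname)"
```

Submit it via stdin, e.g. bsub < myjob.lsf, so that the #BSUB directives are actually read.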

A quick conversion guide from the PBS qsub to the LSF bsub can be found here.

Some useful commands:

bjobs - shows all your jobs in the queue
bpeek - peek at your output before the job ends
bqueues - what queues are available
bkill - kill a job

Check out the man pages for all the options.
