Logging in

These instructions cover accessing Tempest and submitting jobs through the terminal.

If you would prefer to use a GUI, follow the instructions for getting started online.

To log in to your Tempest account through the terminal, connect to the MSU VPN, then use the following command:

ssh <net_id>@tempest-login.msu.montana.edu

If this is your first time using SSH to access Tempest, follow these instructions to set up an SSH key.
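
The linked SSH key instructions are the authoritative reference. As a rough sketch, on a Linux or macOS machine the setup typically involves generating a key pair locally and then installing the public key on Tempest (the exact installation step on Tempest may differ, and the key type below is just a common default):

# generate a key pair on your local machine (press Enter to accept the default path)
ssh-keygen -t ed25519

# install the public key on Tempest, if password-based logins are permitted
ssh-copy-id <net_id>@tempest-login.msu.montana.edu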

Submitting sbatch scripts

Tempest uses a cluster management platform called Slurm for managing jobs.

All significant computational workloads must be submitted as jobs. Jobs can be submitted using sbatch scripts, which specify the computational resources and software a job will use.

To submit a job, use the sbatch command. For example, to run the job defined in the example.sbatch script in ~/slurm-examples/, you would use the following command:

sbatch ~/slurm-examples/example.sbatch
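
If the submission succeeds, sbatch reports the ID assigned to the new job, which you will need when checking on or referring to the job later. The output looks like this (the job ID shown is just an example):

Submitted batch job 123456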

To check the status of your jobs:

sacct
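
By default, sacct lists your recent jobs with minimal detail. If you want specific columns, you can pass a field list; the following is a minimal sketch that shows each job's name, partition, state, and elapsed time (squeue -u <net_id> is an alternative that shows only pending and running jobs):

sacct --format=JobID,JobName,Partition,State,Elapsed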

Below are template submission scripts that can be modified to submit a variety of jobs to CPU and GPU resources:

CPU Sbatch script

For submission to priority resources:

#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --account=priority-<group_name>   # specify the priority account to use
#SBATCH --job-name=<your_job_name>        # job name
#SBATCH --partition=priority              # queue partition to run the job in
#SBATCH --nodes=1                         # number of nodes to allocate
#SBATCH --ntasks-per-node=1               # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2                 # number of cores to allocate
#SBATCH --mem=2G                          # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01                 # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3                      # number of jobs in array (remove one leading # to enable; see the note below this script)
#SBATCH --output=<your_job_name>-%j.out   # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err    # standard error file

## Run 'man sbatch' for more information on the options above.

### Replace the below with modules and commands
date                            # print out the date
hostname -s                     # print a message from the compute node
date                            # print the date again
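
To submit this template as an actual job array (as its header suggests), remove one leading # from the --array line and use Slurm's SLURM_ARRAY_TASK_ID environment variable in the command section so that each task does different work. A minimal sketch, where the program and input file names are placeholders:

## each array task prints its index and processes its own input file
echo "Running array task ${SLURM_ARRAY_TASK_ID}"
./my_program input_${SLURM_ARRAY_TASK_ID}.dat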

For submission to the unsafe partition:

#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name>        # job name
#SBATCH --partition=unsafe               # queue partition to run the job in
#SBATCH --nodes=1                       # number of nodes to allocate
#SBATCH --ntasks-per-node=1             # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2               # number of cores to allocate
#SBATCH --mem=2G                        # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01               # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3                    # number of jobs in array (remove one leading # to enable)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err  # standard error file

## Run 'man sbatch' for more information on the options above.

### Replace the below with modules and commands
date                            # print out the date
hostname -s                     # print a message from the compute node
date                            # print the date again

For submission to the nextgen partition:

#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name>        # job name
#SBATCH --partition=nextgen               # use nextgen-long for jobs longer than 3 days
#SBATCH --nodes=1                         # number of nodes to allocate
#SBATCH --ntasks-per-node=1               # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2                 # number of cores to allocate
#SBATCH --mem=2G                          # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01                 # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3                      # number of jobs in array (remove one leading # to enable)
#SBATCH --output=<your_job_name>-%j.out   # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err    # standard error file

## Run 'man sbatch' for more information on the options above.

### Replace the below with modules and commands
date                            # print out the date
hostname -s                     # print a message from the compute node
date                            # print the date again

GPU Sbatch script

For submission to the priority partition:

#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --account=priority-<group_name> # priority account to use
#SBATCH --job-name=<your_job_name>      # job name
#SBATCH --partition=gpupriority         # queue partition to run the job in
#SBATCH --nodes=1                       # number of nodes to allocate
#SBATCH --ntasks-per-node=1             # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8               # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1               # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000                      # 2000 MB of memory allocated - do not allocate more than 128000 MB mem per GPU
#SBATCH --time=1-00:10:00               # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err  # standard error file

## Run 'man sbatch' for more information on the options above.

### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version
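
To confirm which GPU the job actually received, you can also run nvidia-smi from the command section of the script; a minimal addition:

## list the GPU(s) visible to this job
nvidia-smi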

For submission to the unsafe partition:

#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name>      # job name
#SBATCH --partition=gpuunsafe         # queue partition to run the job in
#SBATCH --nodes=1                       # number of nodes to allocate
#SBATCH --ntasks-per-node=1             # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8               # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1               # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000                      # 2000 MB of memory allocated - do not allocate more than 128000 MB mem per GPU
#SBATCH --time=1-00:10:00               # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err # standard error file

## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version

For submission to the nextgen GPU partition:

#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name>      # job name
#SBATCH --partition=nextgen-gpu # use nextgen-gpu-long for jobs greater than 3 days
#SBATCH --nodes=1                       # number of nodes to allocate
#SBATCH --ntasks-per-node=1             # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8               # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1               # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000                      # 2000 MB of memory allocated - do not allocate more than 128000 MB mem per GPU
#SBATCH --time=1-00:10:00               # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err # standard error file

## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version
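
The CUDA module shown above is only an example, and the versions installed on Tempest may change over time. You can list the CUDA modules currently available before choosing one:

module avail CUDA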