How do I interact with Jobs in Real Time?

Interactive Jobs

Batch jobs are submitted to the Slurm queuing system and run when the requested resources become available. However, batch jobs are not suitable when you need to test and troubleshoot code in real time. Interactive jobs let you interact with applications in real time: you can run graphical user interface (GUI) applications, execute scripts, or run other commands directly on a compute node.

Using the srun command:

srun submits your resource request to the queue. When the resources become available, a new bash session starts on the reserved compute node. The srun command accepts the same Slurm flags that are used for batch jobs.

Example:

srun -N 1 -n 4 -A users -p short --qos=users --gres=gpu:1 --mem=64G --time 1:00:00 --constraint=tesla_v100 --pty bash

With this command, Slurm reserves 1 node, 4 cores, 64 GB of RAM, and 1 GPU in the short queue with a 1-hour time limit; the constraint flag restricts the GPU type to tesla_v100. Slurm then opens a terminal on the compute node. If that terminal is closed, the job is killed.
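To review a resource request before submitting it, the command can be assembled and printed first. The following is a minimal sketch, not part of the cluster's tooling: the function name interactive_cmd and its default values are hypothetical, while the account, partition, and QOS values are taken from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the srun command for an interactive session
# instead of running it, so the request can be checked before submission.
interactive_cmd() {
    local cores="${1:-4}"           # number of tasks (-n)
    local mem="${2:-64G}"           # memory per node (--mem)
    local walltime="${3:-1:00:00}"  # time limit (--time)
    echo "srun -N 1 -n ${cores} -A users -p short --qos=users" \
         "--mem=${mem} --time=${walltime} --pty bash"
}

# Print a smaller request: 2 cores, 16 GB of RAM, 30 minutes.
interactive_cmd 2 16G 0:30:00
```

Once the printed command looks right, it can be copied to the prompt and run as is.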

Using the salloc command:

salloc works similarly to srun --pty bash. It submits your resource request to the queue. When the resources become available, it opens a terminal on the login node rather than on the compute node. However, you will have permission to ssh to the reserved node.

 

Example (same resource request as in the srun example):

salloc -N 1 -n 4 -A users -p short --qos=users --gres=gpu:1 --mem=64G --time 1:00:00 --constraint=tesla_v100

Once the resources are granted, you need to find out which node has been reserved.
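One way to look up the reserved node is sketched below. It assumes you are in the shell that salloc opened, where Slurm sets the SLURM_JOB_NODELIST environment variable; the function name reserved_nodes is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the node(s) reserved for the current allocation.
# SLURM_JOB_NODELIST is set by Slurm inside the shell started by salloc.
reserved_nodes() {
    if [ -n "${SLURM_JOB_NODELIST:-}" ]; then
        echo "${SLURM_JOB_NODELIST}"
    else
        echo "no Slurm allocation found in this shell" >&2
        return 1
    fi
}

# Typical use for a single-node request (-N 1):
#   ssh "$(reserved_nodes)"
```

For multi-node allocations the variable holds a compressed list, which scontrol show hostnames can expand into one hostname per line. Outside the salloc shell, squeue -u $USER lists your jobs together with the nodes assigned to them.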

 

or