...

This is an example job script for the KUACC HPC cluster. Note that a job script should start with #!/bin/bash.

Code Block
#!/bin/bash

#SBATCH --job-name=Test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --partition=short
#SBATCH --qos=users
#SBATCH --account=users
#SBATCH --gres=gpu:tesla_t4:1
#SBATCH --time=1:0:0
#SBATCH --output=test-%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=foo@bar.com

module load python/3.6.1
module load cuda/11.4
module load 8.2.2/cuda-11.4

python code.py

 

A job script can be divided into three sections:

...

This section is where resources are requested and Slurm parameters are configured. Every line in this section must begin with "#SBATCH", followed by one flag per request.

Code Block
#SBATCH <flag>
#SBATCH --job-name=Test                                 # Setting a job name
#SBATCH --nodes=1                                       # Asking for only one node
#SBATCH --ntasks-per-node=1                             # Asking for one core on each node
#SBATCH --partition=short                               # Running on the short queue (max 2 hours)
#SBATCH --qos=users                                     # Running on the users qos (rules and limits)
#SBATCH --account=users                                 # Running on users partitions (group of nodes)
#SBATCH --gres=gpu:tesla_t4:1                           # Asking for one tesla_t4 GPU
#SBATCH --time=1:0:0                                    # Reserving a one-hour time limit
#SBATCH --output=test-%j.out                            # Setting an output file name
#SBATCH --mail-type=ALL                                 # Which emails to send (BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=foo@bar.com                         # Where to send emails
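
The two formatting rules stated on this page (the script starts with #!/bin/bash, and "#SBATCH" sits at the beginning of its line) can be checked before submitting. The following is only a minimal sketch: check_jobscript is a hypothetical helper name, not a Slurm or KUACC command, and it checks nothing beyond these two rules.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm or KUACC tooling).
check_jobscript() {
    local script="$1"
    # Rule 1: the first line must be the bash shebang.
    if [ "$(head -n 1 "$script")" != '#!/bin/bash' ]; then
        echo "error: first line is not #!/bin/bash" >&2
        return 1
    fi
    # Rule 2: "#SBATCH" must be at the beginning of the line, so reject indented ones.
    if grep -qE '^[[:space:]]+#SBATCH' "$script"; then
        echo "error: indented #SBATCH line found" >&2
        return 1
    fi
    echo "ok"
}

# Demo on a small temporary job script.
demo="$(mktemp)"
printf '%s\n' '#!/bin/bash' '#SBATCH --job-name=Test' '#SBATCH --partition=short' 'python code.py' > "$demo"
result="$(check_jobscript "$demo")"
echo "$result"
rm -f "$demo"
```

On your own script you would call it as check_jobscript jobscript.sh before running sbatch.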

...

The KUACC HPC partitions are listed below. You can see the active partitions with the sinfo command.

Name          MaxTimeLimit   Nodes      MaxJobs    MaxSubmitJob
short         2 hours        50 nodes   50         300
mid           1 day          45 nodes   35         200
long          7 days         5 nodes    25         100
longer        30 days        3 nodes    5          50
ai            7 days         16 nodes   8          100
ilac          Infinite       12 nodes   Infinite   Infinite
cosmos        Infinite       8 nodes    Infinite   Infinite
biyofiz       Infinite       4 nodes    Infinite   Infinite
cosbi         Infinite       1 node     Infinite   Infinite
kutem         Infinite       1 node     Infinite   Infinite
iui           Infinite       1 node     Infinite   Infinite
hamsi         Infinite       1 node     Infinite   Infinite
lufer         Infinite       1 node     Infinite   Infinite
shallowai     7 days         16 nodes   8          100
biyofiz_gpu   Infinite       4 nodes    Infinite   Infinite
kutem_gpu     Infinite       1 node     Infinite   Infinite
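
As an illustration of how the time limits above guide the choice of partition, the sketch below picks the partition with the smallest MaxTimeLimit that still fits a requested run time. pick_partition is a hypothetical helper, not a Slurm or KUACC command. It assumes only the short, mid, long and longer partitions are candidates, and it ignores node counts, job limits and the group partitions.

```shell
#!/bin/bash
# Hypothetical helper: map a requested run time (whole hours) to a partition,
# using the MaxTimeLimit values listed in the table above.
pick_partition() {
    local hours="$1"
    if   [ "$hours" -le 2 ];   then echo short    # 2 hours
    elif [ "$hours" -le 24 ];  then echo mid      # 1 day
    elif [ "$hours" -le 168 ]; then echo long     # 7 days
    elif [ "$hours" -le 720 ]; then echo longer   # 30 days
    else
        echo "no general partition allows ${hours} hours" >&2
        return 1
    fi
}

partition="$(pick_partition 10)"
echo "#SBATCH --partition=${partition}"
```

For a 10-hour job this prints the directive for the mid partition. The limits are hard-coded from the table, so check them against sinfo if the cluster configuration has changed.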

 

The following flags can be used in your job scripts.

Note: Every flag starts with two dashes (--). Some editors display these as a single long dash (–); always type two dashes in your job script.

Resource             Flag Syntax                        Description                                                                   Notes
partition            --partition=short                  Partition is a queue for jobs.                                                default on KUACC is short
qos                  --qos=users                        QOS is quality of service value (limits or priority boost).                   default on KUACC is users
time                 --time=01:00:00                    Time limit for the job.                                                       1 hour; default is 2 hours
nodes                --nodes=1                          Number of compute nodes for the job.                                          default is 1
cpus/cores           --ntasks-per-node=4                Corresponds to number of cores on the compute node.                           default is 1
resource feature     --gres=gpu:1                       Request use of GPUs on compute nodes.                                         default is no feature
memory               --mem=4096                         Memory limit per compute node for the job. Do not use with mem-per-cpu flag.  default limit is 4096 MB per core
memory               --mem-per-cpu=14000                Per-core memory limit. Do not use with the mem flag.                          default limit is 4096 MB per core
account              --account=users                    Users may belong to groups or accounts.                                       default is the user's primary group
job name             --job-name="hello_test"            Name of job.                                                                  default is the JobID
constraint           --constraint=gpu                   kuacc-nodes                                                                   AVAIL_FEATURES
output file          --output=test.out                  Name of file for stdout.                                                      default is the JobID
email address        --mail-user=username@ku.edu.tr     User's email address.                                                         required
email notification   --mail-type=ALL, --mail-type=END   When email is sent to user.                                                   omit for no email

Note on the --mem and --mem-per-cpu flags:

...

Code Block
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=20000

In total, 5 × 20000 MB = 100000 MB is reserved. To request memory in GB, append only the letter G (not GB) to the number. Example: 20G
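
The arithmetic above can be reproduced in the shell before choosing a value. This is only a sketch with illustrative variable names. The second calculation assumes Slurm's usual convention that 1G equals 1024 MB.

```shell
#!/bin/bash
# Total memory reserved = number of tasks x memory per core.
ntasks=5
mem_per_cpu_mb=20000
total_mb=$(( ntasks * mem_per_cpu_mb ))
echo "Total reserved: ${total_mb} MB"

# Same request written as 20G per core (assumption: 1G = 1024 MB).
mem_per_cpu_gb=20
total_mb_from_gb=$(( ntasks * mem_per_cpu_gb * 1024 ))
echo "Total reserved with 20G per core: ${total_mb_from_gb} MB"
```

Note that 20G is slightly more than 20000 MB under this convention, so the two totals differ (100000 MB versus 102400 MB).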

...

Code Block
module load python/3.6.1
module load cuda/11.4
module load 8.2.2/cuda-11.4

For more information see the installing software modules page.

...

Code Block
sbatch jobscript.sh
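
When a job is submitted, sbatch normally prints a confirmation line of the form "Submitted batch job <id>". The job ID in that line is what scancel expects later. The sketch below shows one way to extract it; job_id_from_output is a hypothetical helper name, and a sample line is used so the example runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: take the fourth word of sbatch's confirmation line,
# which normally reads "Submitted batch job <id>".
job_id_from_output() {
    awk '{ print $4 }' <<< "$1"
}

# On the cluster you would capture the real line with:
#   submit_output="$(sbatch jobscript.sh)"
submit_output="Submitted batch job 123456"   # sample line for illustration

job_id="$(job_id_from_output "$submit_output")"
echo "Job ID: ${job_id}"
```

The extracted ID can then be passed to scancel, for example scancel "$job_id".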

 

Command            Syntax               Description                                  Example
sbatch             sbatch [script]      Submit a batch job                           $ sbatch job.sub
scancel            scancel [job_id]     Kill a running job or cancel a queued one    $ scancel 123456
squeue             squeue               List running or pending jobs                 $ squeue
squeue -u userid   squeue -u [userid]   List running or pending jobs of one user     $ squeue -u john