hpc:slurm
hpc:slurm [2024/06/14 14:17] – [CPU] Yann Sagon → hpc:slurm [2025/04/08 17:05] (current) – [Clusters partitions] Adrien Albert
  * Special public partitions:
    * ''
    * ''
    * ''
    * ''
^ Partition ^ ^ ^
|debug-cpu | | |
|public-interactive-gpu |4 hours | |
|public-interactive-cpu |8 hours |10GB |
|public-longrun-cpu | | |
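If you just need a quick interactive shell on one of these partitions, a minimal sketch looks like this (the partition name comes from the table above; the core, memory, and time values are illustrative and must stay within the listed limits):

<code console>
# request 1 core, 4GB and 1 hour on the public-interactive-cpu partition
[user@login2 ~]$ salloc --partition=public-interactive-cpu --cpus-per-task=1 --mem=4G --time=01:00:00
</code>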
Minimum resource is one core.

N.B.: no ''
===== GPGPU jobs =====
When we talk about [[https://
You can see on this table [[hpc:
#SBATCH --partition=shared-gpu
#SBATCH --gpus=1
#SBATCH --constraint="
</code>
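Putting these options together, a complete minimal GPU job script could look like the following sketch (the job name and time limit are illustrative; if you need a specific GPU model, the ''--constraint'' value must come from the GPU models table mentioned above):

<code bash>
#!/bin/sh
#SBATCH --job-name=gpu-test       # illustrative name
#SBATCH --partition=shared-gpu
#SBATCH --gpus=1
#SBATCH --time=00:10:00           # illustrative time limit

# show the GPU(s) that were allocated to the job
srun nvidia-smi
</code>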
If you want other information, please see the sacct manpage.

<note tip>By default the command displays a lot of fields. You can use this trick to display them correctly; then you can move with the left and right arrow keys to see the remaining fields:
<code console>
(yggdrasil)-[root@admin1 ~]$ sstat -j 39919765 --all | less -#2 -N -S
      1 JobID
      2 ------------ ---------- -------------- -------------- ---------- ---------- ---------- ---------- ---------- -------- ------------ -------------- ---------- ---------- ---------- ---------- ---------- -------- ---------- ------->
      3 39919765.ex+
      4 39919765.ba+
</code>
</note>
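Alternatively, instead of paging through every column, you can restrict ''sstat'' to just the fields you need with ''--format'' (the job id is the same illustrative one as above; the field names are standard sstat fields):

<code console>
(yggdrasil)-[root@admin1 ~]$ sstat -j 39919765 --format=JobID,MaxRSS,AveCPU,AveRSS
</code>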
===== Energy usage =====
==== CPUs ====
</code>
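As a sketch (assuming the compute nodes report energy counters to Slurm; the job id is illustrative), the energy accounted for a finished job can be queried through the standard ''ConsumedEnergy'' field of ''sacct'':

<code console>
[user@login2 ~]$ sacct -j 45517641 --format=JobID,Elapsed,ConsumedEnergy
</code>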
===== Job history =====

You can see your job history using ''sacct'':

<code console>
[sagon@master ~] $ sacct -u $USER -S 2021-04-01
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
45517641
45517641.ba+
45517641.ex+
45517641.0
45518119
45518119.ba+
45518119.ex+
</code>
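''sacct'' can also filter this history further, for instance by job state and with an explicit end date (the dates are illustrative; ''-E'' and ''--state'' are standard sacct options):

<code console>
[sagon@master ~] $ sacct -u $USER -S 2021-04-01 -E 2021-05-01 --state=FAILED
</code>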
===== Report and statistics with sreport =====

To get reporting about your past jobs, you can use ''sreport''.

Here are some examples that can give you a starting point:

To get the number of jobs you ran (you <=> ''$USER'') during 2018:

<code console>
[brero@login2 ~]$ sreport job sizesbyaccount user=$USER PrintJobCount start=2018-01-01 end=2019-01-01

--------------------------------------------------------------------------------
Job Sizes 2018-01-01T00:00:00 - 2018-12-31T23:59:59
Units are in number of jobs ran
--------------------------------------------------------------------------------
  Cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
</code>
You can see how many jobs were run (grouped by allocated CPU). You can also see we specified an extra day for the //end date// (''end=2019-01-01''): the reporting period stops at the end of the day before the end date, so this covers the whole of 2018.

You can also check how much CPU time (in seconds) you have used on the cluster since 2019-09-01:

<code console>
[brero@login2 ~]$ sreport cluster AccountUtilizationByUser user=$USER start=2019-09-01 -t Seconds
--------------------------------------------------------------------------------
Cluster/Account/User Utilization
Usage reported in CPU Seconds
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name     Used   Energy
--------- --------------- --------- --------------- -------- --------
</code>
In this example, we added the ''-t Seconds'' option to report the usage in seconds.

Please note:
  * By default, the CPU time is in minutes
  * It takes up to an hour for Slurm to update this information in its database, so be patient
  * If you don't specify a start nor an end date, yesterday's usage is reported
  * The CPU time is the time that was allocated to you. It doesn't necessarily mean it was all effectively used for computation

Tip: If you absolutely need a report including your jobs that ran on the same day, you can override the default end date by forcing tomorrow's date:

<code console>
sreport cluster AccountUtilizationByUser user=$USER start=2019-09-01 end=$(date --date="tomorrow" +%F) -t Seconds
</code>