hpc:slurm

Last revised 2025/07/25 10:18 by Yann Sagon (previous revision: 2024/11/20 09:03).
  * Special public partitions:
    * ''
    * ''
    * ''
    * ''
^ Partition ^
|debug-cpu |
|public-interactive-gpu |4 hours |
|public-interactive-cpu |8 hours |10GB |
|public-longrun-cpu |
Minimum resource is one core.

N.B.: no ''
^ Partition ^
| private-<
| - | |||
| - | To see details about a given partition, go to the web page https:// | ||
| - | If you belong in one of these groups, please contact us to request to have access to the correct partition as we have to manually add you. | ||
| - | |||
If you want other information, please see the ''sacct'' manpage.
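If you only need a few columns, ''sacct'' also has machine-readable output (''--parsable2'' with a ''--format'' field list) that pipes cleanly into ''awk''. A minimal sketch — the job lines below are made-up sample data, not output from a real cluster:

```shell
# Filter a parsable sacct dump down to completed jobs.
# The printf simulates `sacct --parsable2 --format=JobID,State,Elapsed`
# output; on a real cluster you would pipe sacct itself into awk.
printf 'JobID|State|Elapsed\n123.0|COMPLETED|00:10:00\n124.0|FAILED|00:01:00\n' \
  | awk -F'|' '$2 == "COMPLETED"'
# prints: 123.0|COMPLETED|00:10:00
```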
<note tip>By default the command displays a lot of fields. You can use this trick to display them correctly; then you can move with the left and right arrow keys to see the remaining fields:
<code console>
(yggdrasil)-[root@admin1 ~]$ sstat -j 39919765 --all | less -#2 -N -S
      1 JobID
      2 ------------ ---------- -------------- -------------- ---------- ---------- ---------- ---------- ---------- -------- ------------ -------------->
      3 39919765.ex+
      4 39919765.ba+
</code>
</note>
===== Energy usage =====

==== CPUs ====
| </ | </ | ||
===== Job history =====

You can see your job history using ''sacct'':

<code console>
[sagon@master ~] $ sacct -u $USER -S 2021-04-01
------------ ---------- ---------- ---------- ---------- ---------- --------
45517641
45517641.ba+
45517641.ex+
45517641.0
45518119
45518119.ba+
45518119.ex+
</code>
| - | |||
| - | |||
| - | |||
===== Report and statistics with sreport =====

To get reporting about your past jobs, you can use ''sreport''.

Here are some examples that can give you a starting point:

To get the number of jobs you ran (you <=> ''user=$USER'') between two dates:
| - | |||
<code console>
[brero@login2 ~]$ sreport job sizesbyaccount user=$USER PrintJobCount start=2018-01-01 end=2019-01-01
--------------------------------------------------------------------------------
Job Sizes 2018-01-01T00:
Units are in number of jobs ran
--------------------------------------------------------------------------------
  Cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
</code>
| - | |||
You can see how many jobs were run (grouped by allocated CPU). You can also see that we specified an extra day for the //end date// (''end=2019-01-01''), since the reporting window stops at the end date.
| - | |||
You can also check how much CPU time (in seconds) you have used on the cluster since 2019-09-01:
| - | |||
<code console>
[brero@login2 ~]$ sreport cluster AccountUtilizationByUser user=$USER start=2019-09-01 -t Seconds
--------------------------------------------------------------------------------
Cluster/
Usage reported in CPU Seconds
--------------------------------------------------------------------------------
  Cluster
--------- --------------- --------- --------------- -------- --------
</code>
| - | |||
In this example, we added the ''-t Seconds'' option so the usage is reported in CPU seconds.
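Raw CPU seconds are often easier to reason about as core-hours. A quick shell conversion — the 1800000 figure is a made-up example, not taken from the report above:

```shell
# Convert CPU seconds (as printed by `sreport ... -t Seconds`) to core-hours.
cpu_seconds=1800000               # example value only
core_hours=$(( cpu_seconds / 3600 ))
echo "${core_hours} core-hours"   # 500 core-hours
```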
| - | |||
Please note:
  * By default, the CPU time is in minutes
  * It takes up to an hour for Slurm to update this information in its database, so be patient
  * If you don't specify a start nor an end date, yesterday's usage is reported
  * The CPU time is the time that was allocated to you; it doesn't necessarily reflect what your jobs actually consumed
| - | |||
Tip: if you absolutely need a report including your jobs that ran today, you can override the default end date by forcing tomorrow's date:
| - | |||
| - | < | ||
| - | sreport cluster AccountUtilizationByUser user=$USER start=2019-09-01 end=$(date --date=" | ||
| - | </ | ||
==== spart ====
| + | |||
| + | <note warning> | ||
| '' | '' | ||