hpc:slurm

Differences

This shows you the differences between two versions of the page.

hpc:slurm [2024/12/05 14:59] Yann Sagonhpc:slurm [2025/07/25 10:18] (current) – [spart] Yann Sagon
Line 68: Line 68:
   * Special public partitions:   * Special public partitions:
     * ''debug-cpu'' - to test your CPU jobs and make sure everything works fine (max. 15 min)     * ''debug-cpu'' - to test your CPU jobs and make sure everything works fine (max. 15 min)
-    * ''debug-gpu'' - to test your GPU jobs and make sure everything works fine (max. 15 min)+    * ''public-interactive-gpu'' - to run interactive jobs or to test your GPU jobs and make sure everything works fine (max. 4h)
     * ''public-interactive-cpu'' - for interactive CPU jobs (max. of 6 cores for 8h)     * ''public-interactive-cpu'' - for interactive CPU jobs (max. of 6 cores for 8h)
     * ''public-longrun-cpu'' - for CPU jobs that don't need many resources but need a longer runtime (max. of 2 cores for 14 days)     * ''public-longrun-cpu'' - for CPU jobs that don't need many resources but need a longer runtime (max. of 2 cores for 14 days)
Line 113: Line 113:
 ^ Partition             ^Time Limit ^Max mem per core ^ ^ Partition             ^Time Limit ^Max mem per core ^
 |debug-cpu              |15 Minutes |full node memory | |debug-cpu              |15 Minutes |full node memory |
-|debug-gpu              |15 Minutes |full node memory |+|public-interactive-gpu |4 hours    |full node memory |
 |public-interactive-cpu |8 hours    |10GB             | |public-interactive-cpu |8 hours    |10GB             |
 |public-longrun-cpu     |14 Days    |10GB             | |public-longrun-cpu     |14 Days    |10GB             |
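
The time limits in the new revision of the table can be checked before a job is submitted. The following is a minimal sketch, not part of the cluster tooling: ''partition_limit_minutes'' and ''fits_partition'' are hypothetical helper names, and the limits are hard-coded from the rows shown above rather than queried from Slurm.

```shell
#!/bin/sh
# Hypothetical helpers: limits are copied from the table above,
# not queried from Slurm.

# Print the time limit of a public partition, in minutes.
partition_limit_minutes() {
    case "$1" in
        debug-cpu)              echo 15 ;;
        public-interactive-gpu) echo $((4 * 60)) ;;
        public-interactive-cpu) echo $((8 * 60)) ;;
        public-longrun-cpu)     echo $((14 * 24 * 60)) ;;
        *)                      return 1 ;;   # partition not in the table
    esac
}

# Succeed if a job of $2 minutes fits within the limit of partition $1.
fits_partition() {
    limit=$(partition_limit_minutes "$1") || return 2
    [ "$2" -le "$limit" ]
}
```

For example, ''fits_partition public-interactive-gpu 180 && echo "fits"'' succeeds because 180 minutes is below the 4 hour limit, while a 16 minute request on ''debug-cpu'' would be rejected.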
Line 126: Line 126:
 Minimum resource is one core. Minimum resource is one core.
  
-N.B. : no ''debug-gpu'', nor ''public-gpu'' partitions on Baobab, as there are only private GPU nodes.+N.B. : no ''public-interactive-gpu'', nor ''public-gpu'' partitions on Baobab, as there are only private GPU nodes.
  
  
Line 137: Line 137:
 ^ Partition             ^Time Limit^Max mem per core  ^default Mem Per core ^ ^ Partition             ^Time Limit^Max mem per core  ^default Mem Per core ^
 | private-<privatename> |7 Days    | full node memory | 3GB                 | | private-<privatename> |7 Days    | full node memory | 3GB                 |
- 
-To see details about a given partition, go to the web page https://baobab.unige.ch and click on the "status" tab. 
-If you belong in one of these groups, please contact us to request to have access to the correct partition as we have to manually add you. 
- 
  
  
Line 768: Line 764:
 If you want other information please see the sacct manpage. If you want other information please see the sacct manpage.
  
 +<note tip>By default, the command displays many fields, which wrap across several lines in the terminal. Piping the output to ''less'' with the ''-S'' option keeps each record on a single line; you can then use the left and right arrow keys to scroll to the remaining fields.
 +<code>
 +(yggdrasil)-[root@admin1 ~]$ sstat -j 39919765 --all | less -#2 -N -S
 +      1 JobID         MaxVMSize  MaxVMSizeNode  MaxVMSizeTask  AveVMSize     MaxRSS MaxRSSNode MaxRSSTask     AveRSS MaxPages MaxPagesNode   MaxPagesTask   AvePages     MinCPU MinCPUNode MinCPUTask     AveCPU   NTasks AveCPUFreq ReqCPUF>
 +      2 ------------ ---------- -------------- -------------- ---------- ---------- ---------- ---------- ---------- -------- ------------ -------------- ---------- ---------- ---------- ---------- ---------- -------- ---------- ------->
 +      3 39919765.ex+    489808K         cpu095              0      5584K      1728K     cpu095          0      1728K        0       cpu095              0          0   00:00:00     cpu095          0   00:00:00        1      2.80M        >
 +      4 39919765.ba+   1298188K         cpu095              0   1298188K    599588K     cpu095          0    599588K     2511       cpu095              0       2511   00:39:25     cpu095          0   00:39:25        1       984K        >
 +</code>
 +
 +</note>
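
When only a few of these fields are of interest, an alternative to scrolling is to keep just the wanted columns. The sketch below assumes the output is pipe-delimited with a header line, which is the layout produced by the ''--parsable2'' option of ''sstat'' and ''sacct''. ''pick_fields'' is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/sh
# Hypothetical helper: print selected columns, chosen by header name, from
# pipe-delimited text whose first line is a header.
# Usage: pick_fields "JobID,MaxRSS" < input
pick_fields() {
    awk -F'|' -v want="$1" '
        NR == 1 {
            n = split(want, names, ",")
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
        }
        {
            line = ""
            for (k = 1; k <= n; k++) {
                idx = col[names[k]]
                sep = (k > 1) ? "|" : ""
                val = idx ? $idx : ""   # unknown names give an empty cell
                line = line sep val
            }
            print line
        }
    '
}
```

A possible use, with the job ID left as a placeholder: ''sstat -j <jobid> --parsable2 | pick_fields "JobID,MaxRSS,AveCPU"''. The ''--format'' option of ''sstat'' can also restrict the fields directly; the helper is only useful when the full output has already been saved or is piped from elsewhere.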
 ===== Energy usage ===== ===== Energy usage =====
 ==== CPUs ==== ==== CPUs ====
Line 801: Line 807:
  
 ==== spart ==== ==== spart ====
 +
 +<note warning>This tool no longer works, and the project appears to be abandoned.</note>
  
 ''spart'' (( https://github.com/mercanca/spart )) is a tool to check the overall partition usage/description. ''spart'' (( https://github.com/mercanca/spart )) is a tool to check the overall partition usage/description.
hpc/slurm.1733410751.txt.gz · Last modified: (external edit)