hpc:faq

Differences

This shows you the differences between two versions of the page.

hpc:faq [2022/04/19 14:03]
Yann Sagon [Troubleshooting]
hpc:faq [2023/12/13 08:37] (current)
Yann Sagon [Can I run interactive tasks ?]
Line 23: Line 23:
 What to do: make sure you aren't the cause. Check with ''htop'' on the login node. If you see that all the CPUs are in use, please take a screenshot and send it to us at hpc@unige.ch.
  
 +===== Account =====
 +
 +==== When does my account expire? ====
 +  * If you have a non-student account (PhD, postdoc, researcher), your account expires when your contract at UNIGE ends. Currently, there is a grace period of around 6 months after the end of your contract.
 +  * If you have an outsider account, check the expiration date you received when you filled in the invitation.
 +  * If you have a UNIGE student account, you can check the expiration date with the ''chage'' command:
 +<code>
 +(baobab)-[yourusername@login2 ~]$ chage -l yourusername
 +Last password change                                    : Apr 01, 2022
 +Password expires                                        : never
 +Password inactive                                       : never
 +Account expires                                         : never
 +Minimum number of days between password change          : 0
 +Maximum number of days between password change          : 99999
 +Number of days of warning before password expires       : 7
 +</code>
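
To use this check in a script, the relevant line can be filtered out of the ''chage'' output. This is only a minimal sketch: ''account_expiry'' is a hypothetical helper name (not a tool provided on the cluster), and it assumes the English labels shown in the output above.

```shell
# Print only the value of the "Account expires" line from `chage -l`
# output read on stdin. Assumes English labels, as in the example above.
account_expiry() {
    sed -n 's/^Account expires[[:space:]]*:[[:space:]]*//p'
}

# Possible usage on a login node (assumes $USER is your cluster login):
#   chage -l "$USER" | account_expiry
```

For the example output above, the helper would print ''never''.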
 +
 +==== I'm leaving UNIGE, can I continue to use Baobab HPC service? ====
 +Yes, this is possible as long as you collaborate closely with your former research group. Your PI must invite you as an [[hpc:access_the_hpc_clusters#outsider_account|outsider]]. For technical reasons, your account must have expired before the invitation is requested.
 +We will then reactivate your account, and you will keep your data.
 ===== Storage =====
  
Line 94: Line 114:
 See [[hpc/slurm#interactive_jobs|Interactive jobs]]
  
 +==== I'm not able to use all the cores of a compute node ====
 +This is expected: we reserve two cores per node for system tasks such as data transfer and operating system work.
 +
 +<code>
 +(yggdrasil)-[root@admin1 ~]$ scontrol show node cpu001
 +NodeName=cpu001 Arch=x86_64 CoresPerSocket=18
 +   CPUAlloc=0 CPUEfctv=34 CPUTot=36 CPULoad=0.01
 +   AvailableFeatures=GOLD-6240,XEON_GOLD_6240,V9
 +   ActiveFeatures=GOLD-6240,XEON_GOLD_6240,V9
 +   Gres=(null)
 +   NodeAddr=cpu001 NodeHostName=cpu001 Version=23.02.1
 +   OS=Linux 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Tue May 16 11:38:37 UTC 2023
 +   RealMemory=187000 AllocMem=0 FreeMem=185338 Sockets=2 Boards=1
 +   CoreSpecCount=2 CPUSpecList=17,35 <==================== this means we have two specialization cores <<<<
 +   State=IDLE ThreadsPerCore=1 TmpDisk=150000 Weight=10 Owner=N/A MCS_label=N/A
 +   Partitions=debug-cpu
 +   BootTime=2023-08-10T12:08:11 SlurmdStartTime=2023-08-10T12:09:00
 +   LastBusyTime=2023-08-11T10:06:42 ResumeAfterTime=None
 +   CfgTRES=cpu=34,mem=187000M,billing=34
 +   AllocTRES=
 +   CapWatts=n/a
 +   CurrentWatts=0 AveWatts=0
 +   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
 +
 +</code>
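
The number of cores left for jobs can be derived from this output as ''CPUTot'' minus ''CoreSpecCount'' (here 36 - 2 = 34, matching ''CPUEfctv''). The sketch below does that subtraction; ''usable_cores'' is a hypothetical helper name, and it assumes one thread per core (''ThreadsPerCore=1''), as on the node shown above.

```shell
# Read `scontrol show node <name>` output on stdin and print
# CPUTot minus CoreSpecCount. Assumes ThreadsPerCore=1.
usable_cores() {
    # Put each key=value token on its own line, then pick the two fields.
    tr ' ' '\n' | awk -F= '
        $1 == "CPUTot"        { total = $2 }
        $1 == "CoreSpecCount" { spec = $2 }
        END { print total - spec }
    '
}

# Possible usage (node name taken from the example above):
#   scontrol show node cpu001 | usable_cores
```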
 +
 +If you really need to use all the cores of a compute node, you can override this setting with ''--core-spec=0''. Setting this option implicitly makes the allocation of the node exclusive.
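
As an illustration, a batch script requesting a node without specialized cores might look like the sketch below. The job name and time limit are placeholders, and the partition is simply the one from the ''scontrol'' output above; adapt them to your own job.

```shell
#!/bin/sh
#SBATCH --job-name=full-node      # placeholder job name
#SBATCH --partition=debug-cpu     # partition taken from the example output
#SBATCH --nodes=1
#SBATCH --core-spec=0             # reserve no specialized cores for this job
#SBATCH --time=00:05:00           # placeholder time limit

# SLURM_CPUS_ON_NODE is set by Slurm inside a job; outside one it is unset.
cpus="${SLURM_CPUS_ON_NODE:-unknown}"
echo "CPUs available to this job: $cpus"
```

Submitted with ''sbatch'', the script reports how many CPUs the job can use on the allocated node.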
 +
 +Reference: https://slurm.schedmd.com/core_spec.html
  
  
hpc/faq.1650369830.txt.gz · Last modified: 2022/04/19 14:03 by Yann Sagon