{{METATOC 1-5}}

====== How our clusters work ======
^ cluster name ^ datacentre ^ Interconnect ^ public CPU ^ public GPU ^ Total CPU size ^ Total GPU size ^
| Baobab |
| Yggdrasil |
All these servers (login, compute, management and storage nodes):
  * run the GNU/Linux distribution [[https://rockylinux.org/|Rocky Linux]].
  * are inter-connected on a high-speed InfiniBand network:
    * 40Gbit/s (QDR) for Baobab.
that will use only CPU or GPU nodes.
===== Cost model =====
<note important>
We are currently changing the investment approach for the HPC service Baobab: research groups will no longer purchase physical nodes as their property. Instead, they will have the option to pay for a share of the cluster for a given duration of usage. This new approach offers several advantages for both the research groups and for us as the service provider.
Research groups that have already purchased compute nodes can convert their ownership into credits for shares. We estimate that a compute node typically lasts at least 6 years under normal conditions, so this conversion ensures that the value of the existing investment is not lost.
</note>
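
As a rough illustration of how such a conversion could work, here is a minimal sketch assuming straight-line depreciation over the 6-year lifetime mentioned above. The purchase price, the node age and the depreciation rule itself are made-up assumptions for the example, not the official conversion formula.

<code python>
# Hypothetical conversion of an owned compute node into share credits.
# Assumes straight-line depreciation over the ~6 year lifetime cited above;
# the purchase price and the node age are made up for illustration.
purchase_price_chf = 12_000   # hypothetical node price
lifetime_years = 6            # typical node lifetime stated in the note above
age_years = 2                 # hypothetical current age of the node

remaining_value_chf = purchase_price_chf * (lifetime_years - age_years) / lifetime_years
print(f"Credit for shares: CHF {remaining_value_chf:,.2f}")
# -> Credit for shares: CHF 8,000.00
</code>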

==== Price per hour ====
Overview:
{{:

You can find the full table, which you can send to the FNS, here: {{:
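
To give an idea of how per-hour prices turn into a yearly budget (for instance when preparing an FNS request), here is a minimal sketch. The rates and usage figures are placeholders only; the actual prices are in the overview image and the table above.

<code python>
# Minimal budget estimate with PLACEHOLDER rates: the actual per-hour
# prices are in the price table linked above, not in this example.
cpu_hour_price_chf = 0.01   # hypothetical price per CPU-core-hour
gpu_hour_price_chf = 0.50   # hypothetical price per GPU-hour

cpu_core_hours = 200_000    # hypothetical yearly CPU usage of a group
gpu_hours = 5_000           # hypothetical yearly GPU usage of a group

total_chf = cpu_core_hours * cpu_hour_price_chf + gpu_hours * gpu_hour_price_chf
print(f"Estimated yearly cost: CHF {total_chf:,.2f}")
# -> Estimated yearly cost: CHF 4,500.00
</code>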

==== Private nodes ====
Research groups can buy "
Rules:
  * The compute node remains the research group's property
  * There is a three-year warranty
  * The research group does not have admin rights on it
  * The compute node is installed and maintained by the HPC team in the same way as the other compute nodes
  * The HPC team can decide to decommission the node when it is too old, but the hardware will be in production for at least four years
See the [[hpc/
See [[hpc::

==== Bamboo (coming soon) ====

^ Generation ^ Model ^ Freq ^ Nb cores ^ Architecture
| V8 | EPYC-7742 | 2.25GHz | 128 cores | "Rome"
| V8 | EPYC-72F3 | 3.7GHz |

^ GPU model ^ Architecture ^ Mem ^ Compute Capability ^ Slurm resource ^ Nb per node ^ Nodes ^ Peer access between GPUs ^
| RTX 3090 | Ampere
| A100 | Ampere

==== Baobab ====
Since our clusters are regularly expanded, the nodes are not all from the same generation. You can see the details in the following table.
^ Generation ^ Model ^ Freq ^ Nb cores ^ Architecture
| V2 | X5650 | 2.67GHz | 12 cores | "Westmere-EP"
| V3 | E5-2660V0 | 2.20GHz | 16 cores | "Sandy Bridge-EP"
| V3 | E5-2660V0 | 2.20GHz | 16 cores | "Sandy Bridge-EP"
| V3 | E5-2670V0 | 2.60GHz | 16 cores | "Sandy Bridge-EP"
| V3 | E5-4640V0 | 2.40GHz | 32 cores | "Sandy Bridge-EP"
| V4 | E5-2650V2 | 2.60GHz | 16 cores | "Ivy Bridge-EP"
| V5 | E5-2643V3 | 3.40GHz | 12 cores | "Haswell-EP"
| V6 | E5-2630V4 | 2.20GHz | 20 cores | "Broadwell-EP"
| ::: |
| V6 | E5-2637V4 | 3.50GHz | 8 cores | "Broadwell-EP"
| V6 | E5-2643V4 | 3.40GHz | 12 cores | "Broadwell-EP"
| V6 | E5-2680V4 | 2.40GHz | 28 cores | "Broadwell-EP"
| V7 | EPYC-7601 | 2.20GHz | 64 cores | "Naples"
| V8 | EPYC-7742 | 2.25GHz | 128 cores | "Rome"
| V9 | GOLD-6240 | 2.60GHz | 36 cores | "Cascade Lake-SP"
| RTX 2080 Ti | Turing
| RTX 2080 Ti | Turing
| RTX 3090 | Ampere
| RTX A5000 | Ampere
| RTX 3080 | Ampere
| A100 | Ampere
| A100 | Ampere
| A100 | Ampere
| A100 | Ampere
| A100 | Ampere
| A100 | Ampere
| 