hpc:hpc_clusters

Differences

This shows you the differences between two versions of the page.

hpc:hpc_clusters [2025/09/01 07:46] – [Cost of Renting a Compute Node] Yann Sagon
hpc:hpc_clusters [2025/12/16 17:04] (current) – [Cost model] Yann Sagon
Line 82: Line 82:
   * **Purchase or rent** compute nodes for more intensive workloads.
  
-You can also find a summary of how this model is implemented here: https://hpc-community.unige.ch/t/hpc-accounting-summary/4056
 +
 +**Summary:**
 +
 +  * Starting this year, you receive a **CPU hours credit** based on the hardware you own (if any) in the cluster (private partition).
 +  * You can find instructions on how to check your annual credit here: [[accounting#resources_available_for_research_group|Resources Available for Research Groups]]. If you know your research group has bought some compute nodes but your PI doesn't appear in the report, please contact us.
 +  * The credit calculation in the provided script assumes a **5-year hardware ownership period**. However, **if** this policy was introduced after your compute nodes were purchased, we have extended the production duration by two years.
 +  * To ensure **flexibility and simplicity**, we have standardized resource usage by converting CPU, memory, and GPU hours into CPU hours, using different conversion ratios depending on the GPU type. More details can be found here: [[accounting#resource_accounting_uniformization|Resource Accounting Uniformization]].
 +  * You can use your credit across all three clusters (**Baobab, Yggdrasil, and Bamboo**), not just on your private compute nodes. However, when using your own compute nodes, you will receive a **higher priority**.
 +  * To check your group's current resource usage, visit: [[accounting#report_and_statistics_with_sreport|Report and Statistics with sreport]]; a minimal example is shown below.
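To see how much of this credit your group has already consumed, you can query the Slurm accounting database with ''sreport''. The snippet below is only a minimal sketch: the account name ''my_group'' is a placeholder, and the cluster name and date range must be adapted to your case.

<code bash>
# Minimal sketch: CPU hours consumed per user of the (hypothetical) account "my_group"
# on Baobab for the current year. Adapt the account, cluster and dates to your group.
sreport cluster AccountUtilizationByUser cluster=baobab account=my_group \
        start=2025-01-01 end=2025-12-31 -t Hours
</code>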
 ==== Price per hour ====
 <WRAP center round important 60%>
Line 98: Line 106:
  
  
 +=== Progressive Pricing for HPC Compute Hours ===
 +A tiered pricing model applies to compute hour billing. Discounts increase as usage grows: once you reach 200K, 500K, and 1,000K compute hours, an additional 10% reduction is applied at each threshold. This ensures cost efficiency for large-scale workloads.
  
 +^ Usage (Compute Hours) ^ Discount Applied ^
 +| 0 – 199,999           | Base Rate       |
 +| 200,000 – 499,999     | Base Rate -10%  |
 +| 500,000 – 999,999     | Base Rate -20%  |
 +| 1,000,000+            | Base Rate -30%  |
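As a worked illustration of the table above, the sketch below estimates the billed amount for a given yearly usage. It assumes that the reduction applies per bracket (the hours falling inside each band are billed at that band's rate) and uses a purely illustrative base rate; the actual base rate is the one given in the //Price per hour// section.

<code bash>
# Illustrative sketch only: tiered billing for a given number of compute hours,
# assuming per-bracket discounts and a placeholder base rate of 0.01 CHF/hour.
hours=650000      # example yearly usage in compute hours
base_rate=0.01    # placeholder; use the actual price per hour

awk -v h="$hours" -v r="$base_rate" 'BEGIN {
    b1 = (h < 200000)  ? h : 200000                                # billed at base rate
    b2 = (h < 500000)  ? ((h > 200000) ? h - 200000 : 0) : 300000  # billed at -10%
    b3 = (h < 1000000) ? ((h > 500000) ? h - 500000 : 0) : 500000  # billed at -20%
    b4 = (h > 1000000) ? h - 1000000 : 0                           # billed at -30%
    printf "estimated cost: %.2f CHF\n", r * (b1 + 0.9*b2 + 0.8*b3 + 0.7*b4)
}'
</code>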
 ===== Purchasing or Renting Private Compute Nodes =====
  
Line 106: Line 121:
  
   * **Shared Integration**: The compute node is added to the corresponding shared partition. Other users may utilize it when the owning group is not using it. For details, refer to the [[hpc/slurm#partitions|partitions]] section.
-  * **Maximum Usage**: Research groups can utilize up to **60% of the node's maximum theoretical computational capacity**. This ensures fair access to shared resources. See [[hpc:hpc_clusters#usage_limit|Usage limit]]
 +  * **Usage Limit**: Each research group may consume up to **60% of the theoretical usage credit associated with the compute node**. This policy ensures fair access to shared cluster resources. See the [[hpc:hpc_clusters#usage_limit|Usage limit]] policy for more details.
-  * **Cost**: In addition to the base cost of the compute node, a **15% surcharge** is applied to cover operational expenses such as cables, racks, switches, and storage.
 +  * **Cost**: In addition to the base cost of the compute node, a **15% surcharge** is applied to cover operational expenses such as cables, racks, switches, and storage (not yet in effect).
   * **Ownership Period**: The compute node remains the property of the research group for **5 years**. After this period, the node may remain in production but will only be accessible via public and shared partitions.
   * **Warranty and Repairs**: Nodes come with a **3-year warranty**. If the node fails after this period, the research group is responsible for **100% of repair costs**. Repairing the node involves sending it to the vendor for diagnostics and a quote, with a maximum diagnostic fee of **420 CHF**, even if the node is irreparable.
Line 193: Line 208:
 We usually order and install new nodes twice per year.
  
-If you want to ask a financial contribution from UNIGE you must complete a COINF application: https://www.unige.ch/rectorat/commissions/coinf/appel-a-projets
 +If you want to request a financial contribution from UNIGE, you must submit a request to the [[https://www.unige.ch/rectorat/commissions/coinf/appel-a-projets|COINF]].
 ====== Use Baobab for teaching ======
  
Line 233: Line 248:
  
 Both clusters contain a mix of "public" nodes provided by the University of Geneva, and "private" nodes in
-general paid 50% by the University and 50% by a research group for instance. Any user of the clusters can
 +general funded 50% by the University through the [[https://www.unige.ch/rectorat/commissions/coinf/appel-a-projets|COINF]] and 50% by a research group, for instance. Any user of the clusters can
 request compute resources on any node (public and private), but a research group that owns "private" nodes has
 a higher priority on its "private" nodes and can request a longer execution time.
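In practice, a group that owns private nodes targets them by submitting jobs to its private partition. The job script below is a minimal sketch: the partition name is a placeholder (your group's actual partitions are listed by ''sinfo''), and the requested resources are only examples.

<code bash>
#!/bin/bash
#SBATCH --partition=private-mygroup-cpu   # placeholder name; list real partitions with: sinfo -s
#SBATCH --time=2-00:00:00                 # owners may request longer run times on their own nodes
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G

srun ./my_program                         # hypothetical application
</code>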
Line 287: Line 303:
 === CPUs on Bamboo ===
  
-^ Generation ^ Model     ^ Freq    ^ Nb cores ^ Architecture               ^ Nodes                             ^ Memory             ^Extra flag    ^ Status            ^
-| V8         | EPYC-7742 | 2.25GHz | 128 cores| "Rome" (7 nm)              | cpu[001-043,049-052],gpu[001-002] | 512GB              |              | on prod           |
-| V8         | EPYC-7742 | 2.25GHz | 128 cores| "Rome" (7 nm)              | cpu[049-052]                      | 256GB              |              | on prod           |
-| V8         | EPYC-7302P| 3.0GHz  | 16 cores | "Rome" (7 nm)              | gpu003                            | 512GB              |              | on prod           |
-| V10        | EPYC-72F3 | 3.7GHz  | 16 cores | "Milan" (7 nm)             | cpu[044-045]                      | 1TB                |BIG_MEM       | on prod           |
-| V10        | EPYC-7763 | 2.45GHz | 128 cores| "Milan" (7 nm)             | cpu[046-048]                      | 512GB              |              | on prod           |
-| V11        | EPYC-9554 | 3.10GHz | 64 cores | "Genoa" (5 nm)             | gpu[004-005]                      | 768GB              |              | on prod           |
 +^ Generation ^ Model     ^ Freq    ^ Nb cores  ^ Architecture               ^ Nodes                             ^ Memory             ^ Extra flag    ^ Status            ^
 +| V8         | EPYC-7742 | 2.25GHz | 128 cores | "Rome" (7 nm)              | cpu[001-043,049-052],gpu[001-002] | 512GB              |               | on prod           |
 +| V8         | EPYC-7742 | 2.25GHz | 128 cores | "Rome" (7 nm)              | cpu[049-052]                      | 256GB              |               | on prod           |
 +| V8         | EPYC-7302P| 3.0GHz  | 16 cores  | "Rome" (7 nm)              | gpu003                            | 512GB              |               | on prod           |
 +| V10        | EPYC-72F3 | 3.7GHz  | 16 cores  | "Milan" (7 nm)             | cpu[044-045]                      | 1TB                | BIG_MEM       | on prod           |
 +| V10        | EPYC-7763 | 2.45GHz | 128 cores | "Milan" (7 nm)             | cpu[046-048]                      | 512GB              |               | on prod           |
 +| V11        | EPYC-9554 | 3.10GHz | 64 cores  | "Genoa" (5 nm)             | gpu[008]                          | 768GB              |               | on prod           |
 +| V11        | EPYC-9554 | 3.10GHz | 128 cores | "Genoa" (5 nm)             | gpu[004-005]                      | 768GB              |               | on prod           |
 +| V12        | EPYC-9654 | 3.70GHz | 96 cores  | "Genoa" (5 nm)             | gpu[006]                          | 768GB              |               | on prod           |
 +| V13        | EPYC-9754 | 3.70GHz | 128 cores | "Genoa" (5 nm)             | gpu[007]                          | 768GB              |               | on prod           |
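The //Extra flag// column lists node features such as ''BIG_MEM''. Assuming these flags are exposed as Slurm node features, they can be requested with the ''--constraint'' option; the job script below is a minimal sketch with a placeholder partition and example resources.

<code bash>
#!/bin/bash
#SBATCH --partition=shared-cpu       # placeholder partition; adapt to your access
#SBATCH --constraint=BIG_MEM         # assumes the extra flag is exposed as a node feature
#SBATCH --mem=600G                   # example: only the 1TB nodes can satisfy this request
#SBATCH --cpus-per-task=8

srun ./my_program                    # hypothetical application
</code>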
 === GPUs on Bamboo ===
  
-^ GPU model   ^ Architecture ^ Mem   ^ Compute Capability ^ Slurm resource ^ Nb per node ^ Nodes            ^ Peer access between GPUs ^
-| RTX 3090    | Ampere       | 25GB  | 8.6                | ampere         | 8           | gpu[001,002]     | NO                       |
-| A100        | Ampere       | 80GB  | 8.0                | ampere         | 4           | gpu[003]         | YES                      |
-| H100        | Hopper       | 94GB  | 9.0                | hopper         | 1           | gpu[004]         | NO                       |
-| H200        | Hopper       | 144GB | 9.0                | hopper         |             | gpu[005]         | NO                       |
 +^ GPU model              ^ Architecture ^ Mem   ^ Compute Capability ^ Slurm resource                ^ Nb per node ^ Nodes            ^ Peer access between GPUs ^
 +| RTX 3090               | Ampere       | 25GB  | 8.6                | nvidia_geforce_rtx_3090       | 8           | gpu[001,002]     | NO                       |
 +| A100                   | Ampere       | 80GB  | 8.0                | nvidia_a100_80gb_pcie         | 4           | gpu[003]         | YES                      |
 +| H100                   | Hopper       | 94GB  | 9.0                | nvidia_h100_nvl               | 1           | gpu[004]         | NO                       |
 +| H200                   | Hopper       | 144GB | 9.0                | nvidia_h200_nvl               |             | gpu[005]         | NO                       |
 +| H200                   | Hopper       | 144GB | 9.0                | nvidia_h200_nvl               | 4           | gpu[006]         | YES                      |
 +| RTX Pro 6000 Blackwell | Blackwell    | 97GB  | 9.0                | nvidia_rtx_pro_6000_blackwell | 4           | gpu[008]         | NO                       |
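The //Slurm resource// column gives the GPU type string to use when requesting a specific model. The job script below is a minimal sketch: the partition name is a placeholder and ''my_gpu_program'' is a hypothetical application.

<code bash>
#!/bin/bash
#SBATCH --partition=shared-gpu            # placeholder partition; adapt to your access
#SBATCH --gres=gpu:nvidia_h100_nvl:1      # one H100, using the "Slurm resource" name from the table
#SBATCH --time=0-04:00:00

srun ./my_gpu_program
</code>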
  
 ==== Baobab ==== ==== Baobab ====
Line 308: Line 329:
 Since our clusters are regularly expanded, the nodes are not all from the same generation. You can see the details in the following table.
  
-^ Generation ^ Model     ^ Freq    ^ Nb cores ^ Architecture               ^ Nodes                                             ^Extra flag      ^ Status                       ^
-| V2         | X5650     | 2.67GHz | 12 cores | "Westmere-EP" (32 nm)      | cpu[093-101,103-111,140-153]                      |                | decommissioned               |
-| V3         | E5-2660V0 | 2.20GHz | 16 cores | "Sandy Bridge-EP" (32 nm)  | cpu[009-010,012-018,020-025,029-044]              |                | decommissioned in 2023       |
-| V3         | E5-2660V0 | 2.20GHz | 16 cores | "Sandy Bridge-EP" (32 nm)  | cpu[011,019,026-028,042]                          |                | decommissioned in 2024       |
-| V3         | E5-2660V0 | 2.20GHz | 16 cores | "Sandy Bridge-EP" (32 nm)  | cpu[001-005,007-008,045-056,058]                  |                | decommissioned in 2024       |
-| V3         | E5-2670V0 | 2.60GHz | 16 cores | "Sandy Bridge-EP" (32 nm)  | cpu[059,061-062]                                  |                | decommissioned in 2024       |
-| V3         | E5-4640V0 | 2.40GHz | 32 cores | "Sandy Bridge-EP" (32 nm)  | cpu[186]                                          |                | decommissioned in 2024       |
-| V4         | E5-2650V2 | 2.60GHz | 16 cores | "Ivy Bridge-EP" (22 nm)    | cpu[063-066,154-172]                              |                | decommissioned in 2025       |
-| V5         | E5-2643V3 | 3.40GHz | 12 cores | "Haswell-EP" (22 nm)       | gpu[002]                                          |                | on prod                      |
-| V6         | E5-2630V4 | 2.20GHz | 20 cores | "Broadwell-EP" (14 nm)     | cpu[173-185,187-201,205-213,220-229,237-264],gpu[004-010] |       | on prod                      |
-| V6         | E5-2637V4 | 3.50GHz | 8 cores  | "Broadwell-EP" (14 nm)     | cpu[218-219]                                      | HIGH_FREQUENCY | on prod                      |
-| V6         | E5-2643V4 | 3.40GHz | 12 cores | "Broadwell-EP" (14 nm)     | cpu[202,216-217]                                  | HIGH_FREQUENCY | on prod                      |
-| V6         | E5-2680V4 | 2.40GHz | 28 cores | "Broadwell-EP" (14 nm)     | gpu[012]                                          |                | on prod                      |
-| V7         | EPYC-7601 | 2.20GHz | 64 cores | "Naples" (14 nm)           | gpu[011]                                          |                | on prod                      |
-| V8         | EPYC-7742 | 2.25GHz | 128 cores| "Rome" (7 nm)              | cpu[273-277,285-307,312-335],gpu[013-046]         |                | on prod                      |
-| V9         | GOLD-6240 | 2.60GHz | 36 cores | "Cascade Lake" (14 nm)     | cpu[084-090,265-272,278-284,308-311,336-349]      |                | on prod                      |
-| V9         | GOLD-6244 | 3.60GHz | 16 cores | “Intel Xeon Gold 6244 CPU” | cpu[351]                                          |                |                              |
-| V10        | EPYC-7763 | 2.45GHz | 128 cores| "Milan" (7 nm)             | cpu[001],gpu[047,048]                             |                | on prod                      |
-| V11        | EPYC-9554 | 3.10GHz | 128 cores| "Genoa" (5 nm)             | gpu[049]                                          |                | on prod                      |
-| V11        | EPYC-9654 | 3.70GHz | 96 cores | "Genoa" (5 nm)             | cpu[350],gpu[050]                                 |                | on prod                      |
 +^ Generation ^ Model        ^ Freq    ^ Nb cores ^ Architecture               ^ Nodes                                             ^ Extra flag     ^ Status                       ^
 +| V5         | E5-2643V3    | 3.40GHz | 12 cores | "Haswell-EP" (22 nm)       | gpu[002]                                          |                | on prod                      |
 +| V6         | E5-2630V4    | 2.20GHz | 20 cores | "Broadwell-EP" (14 nm)     | cpu[173-185,187-201,205-213,220-229,237-264],gpu[004-009] |       | on prod                      |
 +| V6         | E5-2637V4    | 3.50GHz | 8 cores  | "Broadwell-EP" (14 nm)     | cpu[218-219]                                      | HIGH_FREQUENCY | on prod                      |
 +| V6         | E5-2643V4    | 3.40GHz | 12 cores | "Broadwell-EP" (14 nm)     | cpu[202,216-217]                                  | HIGH_FREQUENCY | on prod                      |
 +| V6         | E5-2680V4    | 2.40GHz | 28 cores | "Broadwell-EP" (14 nm)     | gpu[012]                                          |                | on prod                      |
 +| V7         | EPYC-7601    | 2.20GHz | 64 cores | "Naples" (14 nm)           | gpu[011]                                          |                | on prod                      |
 +| V8         | EPYC-7742    | 2.25GHz | 128 cores| "Rome" (7 nm)              | cpu[273-277,285-307,312-335],gpu[013-046]         |                | on prod                      |
 +| V9         | SILVER-4210R | 2.60GHz | 36 cores | "Cascade Lake" (14 nm)     | gpu010                                            |                | on prod                      |
 +| V9         | GOLD-6240    | 2.60GHz | 36 cores | "Cascade Lake" (14 nm)     | cpu[084-090,265-272,278-284,308-311,336-349]      |                | on prod                      |
 +| V9         | GOLD-6244    | 3.60GHz | 16 cores | "Cascade Lake" (14 nm)     | cpu[351]                                          |                |                              |
 +| V10        | EPYC-7763    | 2.45GHz | 128 cores| "Milan" (7 nm)             | cpu[001],gpu[047,048]                             |                | on prod                      |
 +| V11        | EPYC-9554    | 3.10GHz | 128 cores| "Genoa" (5 nm)             | gpu[049]                                          |                | on prod                      |
 +| V12        | EPYC-9654    | 3.70GHz | 192 cores| "Genoa" (5 nm)             | cpu[350]                                          |                | on prod                      |
 +| V12        | EPYC-9654    | 3.70GHz | 96 cores | "Genoa" (5 nm)             | gpu[050]                                          |                | on prod                      |
  
 The "generation" column is just a way to classify the nodes on our clusters. In the following table you can see the features of each architecture. The "generation" column is just a way to classify the nodes on our clusters. In the following table you can see the features of each architecture.
Line 356: Line 372:
 | Titan X     | Pascal       | 12GB  | 6.1               | nvidia_titan_x             | titan                | 8         | gpu[009-010]     |
 | RTX 2080 Ti | Turing       | 11GB  | 7.5               | nvidia_geforce_rtx_2080_ti | turing               | 2         | gpu[011]         |
-| RTX 2080 Ti | Turing       | 11GB  | 7.5               | nvidia_geforce_rtx_2080_ti | turing               | 8         | gpu[012,015]     |
-| RTX 2080 Ti | Turing       | 11GB  | 7.5               |                            | turing               | 8         | gpu[013,016]     |
 +| RTX 2080 Ti | Turing       | 11GB  | 7.5               | nvidia_geforce_rtx_2080_ti | turing               | 8         | gpu[015]         |
 +| RTX 2080 Ti | Turing       | 11GB  | 7.5               | nvidia_geforce_rtx_2080_ti | turing               | 8         | gpu[013,016]     |
 | RTX 2080 Ti | Turing       | 11GB  | 7.5               | nvidia_geforce_rtx_2080_ti | turing               | 4         | gpu[018-019]     |
 | RTX 3090    | Ampere       | 25GB  | 8.6               | nvidia_geforce_rtx_3090    | ampere               | 8         | gpu[025]         |
Line 386: Line 402:
 Since our clusters are regularly expanded, the nodes are not all from the same generation. You can see the details in the following table.
  
-^ Generation ^ Model     ^ Freq    ^ Nb cores ^ Architecture               ^ Nodes                        ^Extra flag    ^
-| V9 | [[https://ark.intel.com/content/www/fr/fr/ark/products/192443/intel-xeon-gold-6240-processor-24-75m-cache-2-60-ghz.html|GOLD-6240]]   | 2.60GHz | 36 cores  | “Cascade Lake” (14 nm)    | cpu[001-083,091-097,120-122] |              |
-| V9 | [[https://ark.intel.com/content/www/us/en/ark/products/192442/intel-xeon-gold-6244-processor-24-75m-cache-3-60-ghz.html|GOLD-6244]]   | 3.60GHz | 16 cores  | “Cascade Lake” (14 nm)    | cpu[112-115]                 |              |
-| V8 | EPYC-7742    | 2.25GHz | 128 cores | "Rome (7 nm) "      | cpu[123-150]             |              |
-| V9 | [[https://ark.intel.com/content/www/fr/fr/ark/products/193390/intel-xeon-silver-4208-processor-11m-cache-2-10-ghz.html|SILVER-4208]] | 2.10GHz | 16 cores  | “Cascade Lake” (14 nm)    | gpu[001-006,008]             |              |
-| V9 | [[https://ark.intel.com/content/www/us/en/ark/products/193954/intel-xeon-gold-6234-processor-24-75m-cache-3-30-ghz.html|GOLD-6234]]   | 3.30GHz | 16 cores  | “Cascade Lake” (14 nm)    | gpu[007]                     |              |
 +^ Generation ^ Model                                                                                                                                  ^ Freq    ^ Nb cores  ^ Architecture              ^ Nodes                        ^ Extra flag   ^
 +| V9         | [[https://ark.intel.com/content/www/fr/fr/ark/products/192443/intel-xeon-gold-6240-processor-24-75m-cache-2-60-ghz.html|GOLD-6240]]    | 2.60GHz | 36 cores  | “Cascade Lake” (14 nm)    | cpu[001-083,091-097,120-122] |              |
 +| V9         | [[https://ark.intel.com/content/www/us/en/ark/products/192442/intel-xeon-gold-6244-processor-24-75m-cache-3-60-ghz.html|GOLD-6244]]    | 3.60GHz | 16 cores  | “Cascade Lake” (14 nm)    | cpu[112-115]                 |              |
 +| V8         | EPYC-7742                                                                                                                              | 2.25GHz | 128 cores | "Rome" (7 nm)             | cpu[123-150]                 |              |
 +| V9         | [[https://ark.intel.com/content/www/fr/fr/ark/products/193390/intel-xeon-silver-4208-processor-11m-cache-2-10-ghz.html|SILVER-4208]]   | 2.10GHz | 16 cores  | “Cascade Lake” (14 nm)    | gpu[001-006,008]             |              |
 +| V9         | [[https://ark.intel.com/content/www/us/en/ark/products/193954/intel-xeon-gold-6234-processor-24-75m-cache-3-30-ghz.html|GOLD-6234]]    | 3.30GHz | 16 cores  | “Cascade Lake” (14 nm)    | gpu[007]                     |              |
 +| V12        | EPYC-9654                                                                                                                              | 3.70GHz | 192 cores | “Genoa” (5 nm)            | cpu[159-164]                 |              |
  
 The "generation" column is just a way to classify the nodes on our clusters. In the following table you can see the features of each architecture. The "generation" column is just a way to classify the nodes on our clusters. In the following table you can see the features of each architecture.