hpc:accounting

Differences

This shows you the differences between two versions of the page.

hpc:accounting [2025/12/04 10:22] – [Resource accounting uniformization] Yann Sagon
hpc:accounting [2025/12/04 10:43] (current) – [Resources available for research group] Yann Sagon
Line 15: Line 15:
 We apply uniform resource accounting by converting GPU hours and memory usage into CPU-hour equivalents, using the [[https://slurm.schedmd.com/tres.html|TRESBillingWeights]] feature provided by SLURM.
 A CPU hour represents one hour of processing time on a single CPU core.
+
 We use this model because our cluster is heterogeneous, and both the computational power and the cost of GPUs vary significantly depending on the model. To ensure fairness and transparency, each GPU type is assigned a weight that reflects its relative performance compared to a CPU core. Similarly, memory usage is converted into CPU-hour equivalents based on predefined weights.
+
 We also bill memory usage because some jobs consume very little CPU but require large amounts of memory, which means an entire compute node is occupied. This ensures that jobs using significant memory resources are accounted for fairly.
+
 Example: A job using a GPU with a weight of 10 for 2 hours and memory equivalent to 5 CPU hours would be billed as 25 CPU hours. This approach guarantees consistent, transparent, and fair resource accounting across all heterogeneous components of the cluster.
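
The arithmetic in the example above can be sketched in Python. This is only an illustration: the function name `billed_cpu_hours` and its parameters are hypothetical, and the sketch assumes that the billed amount is the plain sum of the weighted terms, as the worked example implies. It is not the cluster's actual accounting code.

```python
def billed_cpu_hours(gpu_weight, gpu_hours, memory_cpu_hours, cpu_hours=0):
    """Return a job's usage in CPU-hour equivalents (illustrative sketch).

    gpu_weight       -- weight assigned to the GPU type (assumed value)
    gpu_hours        -- hours of GPU time used by the job
    memory_cpu_hours -- memory usage already converted to CPU hours
    cpu_hours        -- plain CPU hours, 0 if not counted separately
    """
    # Assumption: the terms are simply added together.
    return cpu_hours + gpu_weight * gpu_hours + memory_cpu_hours


# Numbers from the example: GPU weight 10, 2 hours, memory worth 5 CPU hours.
print(billed_cpu_hours(gpu_weight=10, gpu_hours=2, memory_cpu_hours=5))  # 25
```

The GPU term contributes 10 x 2 = 20 CPU hours and the memory term 5, which gives the 25 CPU hours quoted in the example.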
-You can check the conversion details by inspecting the parameters of any partition on the clusters. The same conversion table is applied everywhere.
+
+You can check the up-to-date conversion details by inspecting the parameters of any partition on the clusters. The same conversion table is applied everywhere.
  
 <code> <code>
Line 42: Line 46:
 ===== Resources available for research group =====
  
-Research groups that have invested in the HPC cluster by purchasing private CPU or GPU nodes benefit from high priority access to these resources.
+Research groups that have invested in the HPC cluster by purchasing private CPU or GPU nodes benefit from **high-priority access** to these resources.
+
+Although these nodes remain available to all users, owners receive **priority scheduling** and a predefined annual allocation of compute hours, referred to as [[accounting#resource_accounting_uniformization|billings]].
+The advantage of this approach is flexibility: you are free to use any resource on any cluster, rather than being restricted to your own nodes. When doing so, your billings will be consumed.
 
-While these nodes remain available to all users, owners receive priority scheduling and a designated number of included compute hours per year.
+To view details of owned resources, users can run the script:
+''ug_getNodeCharacteristicsSummary.py''
+This script provides a summary of the node characteristics within the cluster.
 
-To check the details of their owned resources, users can run the script ''ug_getNodeCharacteristicsSummary.py'', which provides a summary of the node characteristics within the cluster.
+**Note:** This model ensures **fairness** across all users. Even if some groups own nodes, resources remain shared. Usage beyond the included billings will be **charged according to the standard accounting model**, ensuring equitable access for everyone.
 
-Example:
+Output example of the script:
 <code> <code>
 ug_getNodeCharacteristicsSummary.py --partitions private-<group>-gpu private-<group>-cpu --cluster <cluster> --summary
Line 54: Line 63:
 ------  -----------  -----  -----  -----------  ------------  --------------------------  -----------  --------------  --------------------------------------  ---------
 cpu084  N-20.02.151     36    187            0                                                    0  2020-02-01                                                   79
-cpu085  N-20.02.152     36    187            0                                                    0  2020-02-01                                                   79
-cpu086  N-20.02.153     36    187            0                                                    0  2020-02-01                                                   79
-cpu087  N-20.02.154     36    187            0                                                    0  2020-02-01                                                   79
+[...]
 cpu088  N-20.02.155     36    187            0                                                    0  2020-02-01                                                   79
-cpu089  N-20.02.156     36    187            0                                                    0  2020-02-01                                                   79
-cpu090  N-20.02.157     36    187            0                                                    0  2020-02-01                                                   79
-cpu209  N-17.12.104     20     94            0                                                    0  2017-12-01                                                   41
-cpu210  N-17.12.105     20     94            0                                                    0  2017-12-01                                                   41
-cpu211  N-17.12.106     20     94            0                                                    0  2017-12-01                                                   41
-cpu212  N-17.12.107     20     94            0                                                    0  2017-12-01                                                   41
-cpu213  N-17.12.108     20     94            0                                                    0  2017-12-01                                                   41
+[...]
 cpu226  N-19.01.161     20     94            0                                                    0  2019-01-01                                                   41
-cpu227  N-19.01.162     20     94            0                                                    0  2019-01-01                                                   41
-cpu228  N-19.01.163     20     94            0                                                    0  2019-01-01                                                   41
+[...]
 cpu229  N-19.01.164     20     94            0                                                    0  2019-01-01                                                   41
 cpu277  N-20.11.131    128    503            0                                                    0  2020-11-01                                          10        251
hpc/accounting.1764843722.txt.gz · Last modified: by Yann Sagon