This is the storage space we offer on our clusters
  
^ Cluster   ^ Path                     ^ Total storage size ^ Nb of servers ^ Nb of targets per server ^ Backup     ^ Quota size                      ^ Quota number of files ^
| Baobab    | ''/home/''               | 138 TB             | 4             | 1 meta, 2 storage        | Yes (tape) | 1 TB                            | -                     |
| :::       | ''/srv/beegfs/scratch/'' | 1.0 PB             | 2             | 1 meta, 6 storage        | No         | -                               | 10 M                  |
| :::       | ''/srv/fast''            | 5 TB               | 1             | 1                        | No         | 500 GB per user, 1 TB per group | -                     |
| Yggdrasil | ''/home/''               | 495 TB             | 2             | 1 meta, 2 storage        | Yes (tape) | 1 TB                            | -                     |
| :::       | ''/srv/beegfs/scratch/'' | 1.2 PB             | 2             | 1 meta, 6 storage        | No         | -                               | 10 M                  |
  
We realize you all have different needs in terms of storage. To guarantee storage space for all users, we have **set quotas on the home and scratch directories**; see the table above for details. Beyond these limits, you will not be able to write to the filesystem. We count on all of you to only store research data on the clusters. We also count on you **to periodically delete old or unneeded files** and to **clean up everything when you leave UNIGE**. Please keep reading to understand when you should use each type of storage.
  
  
**The maximum file count is currently set to 10M.**
  
What does this mean for you: if the number of files in your scratch space exceeds 10M, you will no longer be able to write to it.
  
Error message:
To resolve the situation, you should clean up some data in your scratch directory.
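If you are not sure where your files accumulate, a quick way to locate them is sketched below. This assumes ''$HOME/scratch'' points to your scratch directory (as used elsewhere on this page); adapt the path otherwise.

<code console>
# count the number of files in your scratch directory
$ find $HOME/scratch/ -type f | wc -l

# show which first-level sub-directories contain the most files
$ for d in $HOME/scratch/*/; do echo "$(find "$d" -type f | wc -l) $d"; done | sort -nr | head
</code>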
  
===== Fast directory =====

A new fast storage space is available, dedicated to jobs that use multiple nodes and need a scratch space shared between the nodes (unlike the node-local scratch).

^ Cluster   ^ Path              ^ Total storage size ^ Nb of servers ^ Backup ^ Quota size                      ^ Quota number of files ^
| Baobab    | ''/srv/fast''     | 5 TB               | 1             | No     | 500 GB per user, 1 TB per group | -                     |

<note important>This storage is erased at each maintenance.</note>

==== Quota ====

As this storage is shared by everyone, we have set up a quota based on the total size. This ensures fair usage and prevents users from filling it up.

You should clean up the data in your fast directory as soon as your jobs are finished.
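As an illustration, here is a minimal job sketch using the fast storage. Only the ''/srv/fast'' path comes from this page; the directory layout, application name and Slurm options are placeholders to adapt.

<code bash>
#!/bin/bash
#SBATCH --job-name=fast-demo
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1

# per-job working directory on the shared fast storage (layout is only a suggestion)
WORKDIR="/srv/fast/${USER}/${SLURM_JOB_ID}"
mkdir -p "${WORKDIR}"

# run the multi-node application, writing its intermediate data to WORKDIR
srun ./my_app --workdir "${WORKDIR}"    # my_app is a placeholder

# keep what you need on scratch, then clean up the fast storage right away
cp -r "${WORKDIR}/results" "${HOME}/scratch/"
rm -rf "${WORKDIR}"
</code>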
  
 ====== Local storage ====== ====== Local storage ======
</note>
  
===== Temporary private space =====
  
On **each** compute node, you can use the following private ephemeral spaces:
Those places are private and only accessible by your job.
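For example, a job can stage its data into one of these private spaces, work there, and copy the results back at the end. This is a generic sketch only: it assumes the private space is exposed through ''$TMPDIR'', so replace it with whichever of the paths listed above you actually use; the file and application names are placeholders.

<code bash>
#!/bin/bash
#SBATCH --job-name=private-tmp-demo
#SBATCH --ntasks=1

# work in a private ephemeral space on the compute node
# ($TMPDIR is an assumption: replace it with one of the paths listed above)
cd "${TMPDIR:-/tmp}"

# stage the input from scratch, compute locally, copy the results back
cp "${HOME}/scratch/input.dat" .                          # input.dat is a placeholder
"${SLURM_SUBMIT_DIR}/my_app" input.dat > output.dat       # my_app is a placeholder
cp output.dat "${HOME}/scratch/"
</code>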
  
===== Temporary shared space =====
If you need to access the same data from more than one job, you can use a space reachable by all your jobs running on the same compute node. When you have no more jobs running on that node, the content of this storage is erased.

The path is the following: ''/share/users/${SLURM_JOB_USER:0:1}/${SLURM_JOB_USER}''

See here for a usage example: https://hpc-community.unige.ch/t/local-share-directory-beetween-jobs-on-compute/2893
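For instance, a first job can stage a dataset into this space and later jobs running on the same node can reuse it. A minimal sketch (the dataset and application names are placeholders; only the path comes from this page):

<code bash>
#!/bin/bash
#SBATCH --job-name=node-shared-demo
#SBATCH --ntasks=1

# per-user space shared between your jobs on the same compute node
SHARED_DIR="/share/users/${SLURM_JOB_USER:0:1}/${SLURM_JOB_USER}"
mkdir -p "${SHARED_DIR}"

# stage the dataset only if a previous job has not already done it
if [ ! -f "${SHARED_DIR}/dataset.tar" ]; then        # dataset.tar is a placeholder
    cp "${HOME}/scratch/dataset.tar" "${SHARED_DIR}/"
fi

"${SLURM_SUBMIT_DIR}/my_app" "${SHARED_DIR}/dataset.tar"   # my_app is a placeholder
</code>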
 ====== Sharing files with other users ====== ====== Sharing files with other users ======
  
For easy sharing you need to set ''umask 0002'' (thus new files and directories will be created group-writable, with 664 and 775 permissions respectively), otherwise you will be asked for confirmation every time you want to modify a file or, even worse, you will not be able to create new files/folders.
  
This is a side-effect of the default permissions on Red Hat-based systems without **User Private Groups** (//i.e.// when the UID/GID differs, as is the case on Baobab/Yggdrasil (( https://hpc-community.unige.ch/t/some-directory-permission-seem-to-change-on-their-own/1050/2 )) ).
</note>
<note info>
Since we use ACLs to set the user rights, you can't rely on the setgid bit to force new files to belong to a group other than your primary group. You have the following options:
  * You can request a change of your primary group: every file that you create on the cluster will then belong to this group
  * You can set your umask to 0002, as explained previously
  * You can run a script on a regular basis that "fixes" the group. Example: ''find . -type f -exec chown :share_xxx {} \;''
</note>
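A quick way to apply these recommendations in practice (''share_xxx'' is the same placeholder group name as above, and the shared directory path is only an example):

<code console>
# make new files and directories group-writable for the current shell
# (add this line to your ~/.bashrc to make it permanent)
$ umask 0002

# periodically fix the group of existing files and directories in a shared folder
$ find /home/share/share_xxx -type f -exec chown :share_xxx {} \;
$ find /home/share/share_xxx -type d -exec chown :share_xxx {} \;
</code>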
 ====== Best practices ====== ====== Best practices ======
===== Check disk usage on the clusters =====
  
==== Check disk usage on home and scratch ====

Since ''/home'' and ''/srv/beegfs/scratch/'' have quota enabled and enforced, we can quickly check the disk usage by fetching the quota information.
  
The script ''beegfs-get-quota-home-scratch.sh'' gives you a quick summary:
  
<code console>
(baobab)-[sagon@login2 ~]$ beegfs-get-quota-home-scratch.sh
home dir: /home/sagon
scratch dir: /srv/beegfs/scratch/users/s/sagon

        user/group                ||           size          ||    chunk files
storage     |     name     |  id  ||    used    |    hard    ||  used   |  hard
------------|--------------|------||------------|------------||---------|---------
home        |         sagon|240477||  530.46 GiB| 1024.00 GiB||  1158225|unlimited
scratch     |         sagon|240477||    2.74 TiB|   unlimited||   436030| 10000000
</code>
  
<WRAP center round tip 60%>
This includes all your data in ''$HOME'', ''$HOME/scratch'', but also any data in ''/home/share'' and ''/srv/beegfs/scratch/shares'' that belongs to you (if you are using a shared directory).
</WRAP>

<note>The column "chunk files" doesn't correspond exactly to the number of files you own: it corresponds to the number of chunks you own. Each file has at least one chunk. The current chunk size is 512 kB; if a file is bigger, it is split into several chunks.</note>

==== Check disk usage on NASAC ====
  
If you also have space in ''/acanas'' (the NASAC), you can check your quota and usage like this:
  
reference: (([[https://hpc-community.unige.ch/t/howto-access-external-storage-from-baobab/551|How to access external storage from Baobab]]))
===== CVMFS =====
All the compute nodes of our clusters have the CernVM-FS client installed. CernVM-FS, the CernVM File System (also known as CVMFS), is a file distribution service that is particularly well suited to distributing software installations across a large number of systems world-wide in an efficient way.

A couple of repositories are mounted on the compute and login nodes, such as:
  
  * atlas.cern.ch
  * grid.cern.ch
  
The content is mounted using autofs under the path ''/cvmfs''. This means that the root directory ''/cvmfs'' may appear empty as long as you haven't explicitly accessed one of the child directories. Doing so will mount the repository for a couple of minutes and then unmount it automatically.
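For example, simply accessing one of the repositories listed above is enough to trigger the mount:

<code console>
# /cvmfs may look empty at first
$ ls /cvmfs

# explicitly accessing a repository makes autofs mount it on the fly
$ ls /cvmfs/atlas.cern.ch

# the repository then stays mounted for a few minutes before autofs unmounts it
$ ls /cvmfs
</code>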

Other flagship repositories available without further configuration:

  * unpacked.cern.ch
  * singularity.opensciencegrid.org (container registry)
  * software.eessi.io (EESSI software stack)

<code>
cvmfs-config.cern.ch  grid.cern.ch
</code>

The EESSI project wrote a nice tutorial about CVMFS, available in the [[https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/|MultiXscale]] repository.

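For example, to use the EESSI software stack distributed via ''software.eessi.io'', you can source its initialisation script. The version directory below is an assumption taken from the EESSI documentation and may change over time; check ''/cvmfs/software.eessi.io/versions/'' for what is currently available.

<code console>
# trigger the autofs mount and initialise the EESSI environment
$ source /cvmfs/software.eessi.io/versions/2023.06/init/bash

# the EESSI modules are now available through the module system
$ module avail
</code>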
  
 ====== Robinhood ====== ====== Robinhood ======