To resolve the situation, you should clean up some data in your home directory and/or migrate some data to your scratch directory.

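For example, a large folder that does not need a backup can simply be moved from your home to your scratch space (the folder name below is just an illustration):

<code console>
(baobab)-[sagon@login1 ~]$ mv ~/my_large_dataset ~/scratch/
</code>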
  
===== Scratch Directory =====
  
**Location and Accessibility:**

Your scratch directory is located at: ''$HOME/scratch''.

  * It is available on both the login node and all compute nodes of the cluster.
  * It offers more storage space than ''$HOME''.
  * It is not backed up.
  
**N.B.: ''$HOME/scratch'' is a symbolic link to your personal folder in ''/srv/beegfs/scratch/''.**

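You can check where the link points with ''readlink''; the target typically contains the first letter of your username followed by your username (shown here for the user ''sagon'', as in the quota example further below):

<code console>
(baobab)-[sagon@login1 ~]$ readlink -f $HOME/scratch
/srv/beegfs/scratch/users/s/sagon
</code>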
  
**Purpose of the Scratch Directory:**

The scratch directory is intended for storing non-unique or regenerable data. You should use it for:

  * Temporary storage of application-generated files.
  * Large datasets used as input during computations.

However, this directory is not backed up, so please avoid storing critical data here.

**Permissions and Access Control:**

  * Your ''$HOME/scratch'' is **only** accessible by you (permission ''0700'').
  * Permission modifications are not allowed and will be **automatically** reset.
  * If you need to share files, refer to [[hpc:storage_on_hpc#sharing_files_with_other_users|Sharing files with other users]].

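To verify that your scratch directory still has the expected ''0700'' permissions, you can for example run (path and output shown for the user ''sagon''):

<code console>
(baobab)-[sagon@login1 ~]$ stat -c '%a %U %n' /srv/beegfs/scratch/users/s/sagon
700 sagon /srv/beegfs/scratch/users/s/sagon
</code>
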
**Best Practices:**

The scratch directory is **not a permanent** storage solution. To ensure efficient use:

  * Regularly clean up unnecessary files (see the example below).
  * Move important data elsewhere once your project is completed. UNIGE provides [[https://www.unige.ch/researchdata/en/store/storage-unige/|storage solutions]] which can be mounted on the clusters.

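As an example, ''find'' can list files above an arbitrary size threshold (here 1 GiB) to spot good cleanup candidates; adapt the path to your own username:

<code console>
(baobab)-[sagon@login1 ~]$ find /srv/beegfs/scratch/users/s/sagon -type f -size +1G -exec ls -lh {} +
</code>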
  
==== Quota ====
Since the scratch storage is shared among all users, a file count quota is enforced to ensure fair usage:

  * Maximum file count: 10 million (10M) files

The quota is based on the number of files you own, not on their total size. If you exceed this limit, you won't be able to write any new files (you will get a ''Disk quota exceeded'' error).

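In addition to the ''beegfs-get-quota-home-scratch.sh'' helper shown further below, you can get a rough count of the entries in your scratch space with ''find'' (this may take a while on large directory trees):

<code console>
(baobab)-[sagon@login1 ~]$ find /srv/beegfs/scratch/users/s/sagon | wc -l
</code>
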
==== Data Retention Policy ====
<note>
Important: The data retention policy will be implemented on Baobab during the next maintenance from February 18 to 21, 2025.
[[https://hpc-community.unige.ch/t/important-boabab-new-data-retention-policy-for-scratch-filesystem/3813|More Info]]
</note>
  
**Automatic Deletion Rules:**

  * Files **older than 3 months** will be automatically deleted.
  * Deletion is based on the last access (read or write) date of each file.

**What This Means for You:**

  * Any file not accessed within the last 3 months will be considered inactive and deleted.
  * Frequently used files will remain unaffected.

By following these guidelines, you can ensure efficient use of the scratch storage while avoiding data loss.

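To preview which of your files would currently fall under this rule, you can list everything whose last access time is older than roughly 3 months (90 days is used here as an approximation):

<code console>
(baobab)-[sagon@login1 ~]$ find /srv/beegfs/scratch/users/s/sagon -atime +90 | head
</code>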
  
===== Fast directory =====
===== Sharing files with other users =====
   * in "scratch" (''/srv/beegfs/scratch/shares/'') - to share datasets for instance    * in "scratch" (''/srv/beegfs/scratch/shares/'') - to share datasets for instance 
  
If you need one, please fill in the form on DW: https://dw.unige.ch/openentry.html?tid=hpc

If you are an external user and don't have access to DW, please ask your PI to fill in the form for you.
  
  
To check your current usage and quotas on home and scratch, you can use the helper script provided on the cluster:

<code console>
(baobab)-[sagon@login1 ~]$ beegfs-get-quota-home-scratch.sh
home dir: /home/sagon
scratch dir: /srv/beegfs/scratch/users/s/sagon
...
</code>
===== NASAC =====
  
  
If you need to mount an external share (a NAS, for example) on Baobab from the command line, you can proceed as follows.
  
<code console>
[sagon@login1 ~] $ dbus-launch bash
</code>
  
Then mount the share:
  
<code console>
[sagon@login1 ~] $ gio mount smb://server_name/share_name
</code>
  
To unmount the share:
  
<code console>
[sagon@login1 ~] $ gio mount -u smb://server_name/share_name
</code>
  
<note important>The data are only available on the node where ''gio'' has been mounted.
If you need to access the data on other nodes, you need to mount it there as well in your sbatch script, as sketched below.</note>

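A minimal sketch of what this could look like inside a job script is shown below; the share name, credentials file, and job options are placeholders to adapt to your own case:

<code bash>
#!/bin/sh
#SBATCH --job-name=use_nas_share
#SBATCH --time=00:15:00

# Start a private dbus session on the compute node, mount the share,
# run the workload, then unmount the share again.
dbus-launch bash -c '
    gio mount smb://server_name/share_name < "$HOME/.credentials"
    # ... commands that read and write data on the mounted share ...
    gio mount -u smb://server_name/share_name
'
</code>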
  
If you need to script this, you can put your credentials in a file in your home directory.
Mount example using credentials in a script:
<code console>
[sagon@login1 ~] $ gio mount smb://server_name/share_name < .credentials
</code>
  
You can check that the gvfs fuse daemon is running for your user:

<code console>
[sagon@login1 ~] $ ps ux | grep -e '[g]vfsd-fuse'
sagon    196919  0.0  0.0 387104  3376 ?        Sl   08:49   0:00 /usr/libexec/gvfsd-fuse /home/sagon/.gvfs -f -o big_writes
</code>
You can list the dbus sessions running under your user:

<code console>
[sagon@login1 ~] $ pgrep -a -U $(id -u) dbus
196761 /usr/bin/dbus-daemon --fork --print-pid 4 --print-address 6 --session
224317 /usr/bin/dbus-daemon --fork --print-pid 4 --print-address 6 --session
</code>
====== Robinhood ======
Robinhood Policy Engine is a versatile tool to manage the contents of large file systems. It scans the scratch BeeGFS filesystems daily. It makes it possible to schedule mass actions on filesystem entries by defining attribute-based policies.
<WRAP center round important 60%>
We are working on the new functionality needed to enforce our scratch data retention policy; the reports are out of date until further notice.
</WRAP>
  
==== Policies ====