This shows you the differences between two versions of the page.
hpc:storage_on_hpc [2023/12/05 17:55] Yann Sagon [CVMFS] |
hpc:storage_on_hpc [2024/05/02 13:51] (current) Gaël Rossignol [Cluster storage] |
||
---|---|---|---|
Line 12: | Line 12: | ||
This is the storage space we offer on our clusters | This is the storage space we offer on our clusters | ||
- | ^ Cluster | + | ^ Cluster |
- | | Baobab | + | | Baobab |
- | | ::: | ''/ | + | | ::: | ''/ |
- | | Yggdrasil | ''/ | + | | ::: | ''/ |
- | | ::: | ''/ | + | | Yggdrasil | ''/ |
+ | | ::: | ''/ | ||
We realize you all have different needs in terms of storage. To guarantee storage space for all users, we have **set a quota on home and scratch directories**, | We realize you all have different needs in terms of storage. To guarantee storage space for all users, we have **set a quota on home and scratch directories**, | ||
Line 83: | Line 84: | ||
To resolve the situation, you should clean up some data in your scratch directory. | To resolve the situation, you should clean up some data in your scratch directory. | ||
+ | ===== Fast directory ===== | ||
+ | |||
+ | A new fast storage area is available. It is dedicated to jobs that use multiple nodes and need their local scratch space (scratchlocal) to be shared between nodes.
+ | |||
+ | ^ Cluster | ||
+ | | Baobab | ||
+ | |||
+ | <note important> | ||
+ | |||
+ | ==== Quota ==== | ||
+ | |||
+ | |||
+ | Because the storage is shared by everyone, we have set up a quota based on the total size. This ensures fair usage and prevents any single user from filling the storage.
+ | |||
+ | You should clean up some data in your fast directory as soon as your jobs are finished. | ||
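As a minimal sketch of the clean-up step described above, the shell helpers below report how much space a directory uses and remove a per-job subdirectory once the job has finished. The function names and the ''FAST_DIR'' variable are hypothetical placeholders, not part of the cluster configuration; replace ''FAST_DIR'' with the fast directory path listed for your cluster.

```shell
# FAST_DIR is a hypothetical placeholder: set it to the fast directory
# path given in the table for your cluster.

# Print the total size of a directory tree in kilobytes.
fast_usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Remove a per-job subdirectory once the job no longer needs it.
# Refuses an empty argument, "/", or a path that is not a directory.
cleanup_job_dir() {
    dir="$1"
    [ -n "$dir" ] && [ "$dir" != "/" ] && [ -d "$dir" ] || return 1
    rm -rf -- "$dir"
}
```

For example, ''fast_usage_kb "$FAST_DIR"'' shows how much you currently store, and ''cleanup_job_dir "$FAST_DIR/my_job"'' removes the data of a finished job, assuming the job wrote into a subdirectory named ''my_job''.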
====== Local storage ====== | ====== Local storage ====== | ||
Line 350: | Line 366: | ||
- | The content is mounted using autofs under the path ''/ | + | The content is mounted using autofs under the path ''/ |
didn't explicitly access one of the child directories. Doing so will mount the repository for a couple of | didn't explicitly access one of the child directories. Doing so will mount the repository for a couple of | ||
minutes, after which it is unmounted automatically. | minutes, after which it is unmounted automatically. | ||
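The on-demand mount behaviour described above can be sketched as a small shell helper: listing a child directory is an explicit access, which is what prompts autofs to mount it. The function name and the ''CVMFS_ROOT'' variable are hypothetical placeholders; set ''CVMFS_ROOT'' to the autofs path mentioned above.

```shell
# CVMFS_ROOT is a hypothetical placeholder for the autofs mount path.

# Access a directory explicitly so that autofs mounts it.
# Returns non-zero if the directory cannot be listed.
ensure_mounted() {
    ls "$1" >/dev/null 2>&1
}
```

For example, ''ensure_mounted "$CVMFS_ROOT/software.eessi.io"'' could be run at the start of a script, before any command that expects the repository to be mounted.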
Line 368: | Line 384: | ||
</ | </ | ||
- | We are using a squid proxy on app1.baobab to reduce the amount of data that has to be transferred. | + | EESSI has published a nice tutorial |
- | + | ||
- | EESSI has published a nice tutorial, available at [[https:// | + |
- | + | ||
- | ==== Requirements ==== | + | |
- | * At least two proxy servers, so that one can be taken down for maintenance, for example | + |
- | * 10 Gbit link to the clients | + |
- | * A good amount of SSD disk space | + |
- | * Disable timeout on autofs: '' | + | |
- | * Cache on compute node: disk (recommended) or ram (for disk-less nodes) or shared-fs (disk-less with few GB of memory) | + | |
- | + | ||
- | + | ||
- | + | ||
- | ==== Troubleshooting ==== | + |
- | + | ||
- | * '' | + | |
- | * '' | + | |
- | + | ||
- | Check if the proxy is responding: | + | |
- | < | + | |
- | nc -vz PROXY_IP 3128 | + |
- | # or | + |
- | tcptraceroute PROXY_IP 3128 | + |
- | # or | + |
- | curl --proxy app1:3128 --head URL | + |
- | </ | + | |
- | Example: | + | |
- | < | + | |
- | [root@admin1 ~]$ curl --proxy app1:3128 --head http:// | + | |
- | </ | + | |
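To use the ''curl --head'' check from a script, the response headers can be reduced to a simple success or failure. The helpers below are a sketch only: the function names are hypothetical, and treating any 2xx or 3xx status as "the proxy is responding" is an assumption, not a rule taken from this page.

```shell
# Read HTTP response headers on stdin and print the status code
# found on the first line (for example "200").
http_status() {
    awk 'NR == 1 && $1 ~ /^HTTP\// { print $2 }'
}

# Succeed only if the status code read from stdin is 2xx or 3xx.
# This threshold is an assumption made for this sketch.
proxy_reply_ok() {
    code=$(http_status)
    case "$code" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: ''curl --silent --proxy app1:3128 --head "$URL" | proxy_reply_ok && echo "proxy is responding"'', where ''URL'' is a placeholder for the address you want to test.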
- | + | ||
- | + | ||
- | Check the cache state (the repository must already be mounted): | + |
- | < | + | |
- | (baobab)-[root@cpu002 ~]$ cvmfs_config stat -v software.eessi.io | + | |
- | Version: 2.10.1.0 | + | |
- | PID: 527813 | + | |
- | Uptime: 0 minutes | + | |
- | Memory Usage: 28744k | + | |
- | File Catalog Revision: 203 (expires in 3 minutes) | + | |
- | File Catalog ID: 38758963141a14961037b7bc7759648f9a95cdf3 | + | |
- | No. Active File Catalogs: 1 | + | |
- | Cache Usage: 54581k / 30720000k | + | |
- | File Descriptor Usage: 0 / 130560 | + | |
- | No. Open Directories: | + | |
- | No. IO Errors: 0 | + | |
- | Connection: http:// | + | |
- | Usage: 0 open() calls (hitrate 0.000%), 3 opendir() calls | + | |
- | Transfer Statistics: 10k read, avg. speed: 11k/s | + | |
- | </ | + | |
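The ''Cache Usage'' line of the output above can be turned into a percentage to make it easier to monitor. The helper below is a sketch with a hypothetical function name; it assumes the line keeps the ''used / total'' layout shown above, with both values in kilobytes and a trailing ''k''.

```shell
# Read `cvmfs_config stat -v` output on stdin and print the cache usage
# as a whole percentage, rounded down.
cache_usage_percent() {
    awk '/^Cache Usage:/ {
        used = $3; total = $5
        sub(/k$/, "", used); sub(/k$/, "", total)
        if (total + 0 > 0) printf "%d\n", (used * 100) / total
    }'
}
```

For example: ''cvmfs_config stat -v software.eessi.io | cache_usage_percent''. Because the result is rounded down, a cache that is less than 1% full is reported as 0.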
- | + | ||
- | + | ||
- | + | ||
- | ==== EESSI ==== | + | |
- | One of the repositories served by CVMFS contains the software compiled by the [[https:// | + |
- | + | ||
- | EESSI " | + | |
- | * multiple architectures (ARM, RISC, Intel, AMD, NVIDIA, etc.) | + |
- | * can compensate for a lack of manpower | + |
- | * usable in commercial cloud worldwide | + | |
- | * optimized for specific generations of microprocessors (AVX, AVX512, ARM SVE) | + | |
- | * integrated with lmod | + | |
- | * arch-spec (CPU detection) | + | |
- | + | ||
- | Usage: | + | |
- | + | ||
- | < | + | |
- | (baobab)-[sagon@login2 ~]$ source / | + | |
- | Found EESSI repo @ / | + | |
- | archdetect says x86_64/ | + | |
- | Using x86_64/ | + | |
- | Using / | + | |
- | Found Lmod configuration file at / | + | |
- | Initializing Lmod... | + | |
- | Prepending / | + | |
- | Environment set up to use EESSI (2023.06), have fun! | + | |
- | </ | + | |