TOC for all the pages in the "HPC" namespace
HPC start
HPC User Documentation
HPC clusters at UNIGE
Use the HPC resources
Other resources
Support - get help
How to use Linux
Linux on HPC clusters
External resources
Baobab and Yggdrasil - our HPC clusters and their infrastructure
How our clusters work
The clusters : Baobab and Yggdrasil
How do our clusters work ?
Overview
Cost model
Price per hour
Private nodes
How do I use your clusters ?
For advanced users
Infrastructure schema
Compute nodes
GPU models on the clusters
Bamboo (coming soon)
Baobab
CPUs on Baobab
GPUs on Baobab
Yggdrasil
CPUs on Yggdrasil
GPUs on Yggdrasil
Monitoring performance
Job accounting
Access : SSH, X2GO
Access the clusters
Account
Standard account
Outsider account
Inactivity Notice and Account Deletion Policy
Cluster connection
Login nodes
Connect using SSH
SSH key
From Linux and Mac OS
From Windows
Access to the compute nodes
GUI access / Desktop with X2Go
File transfer
From Linux
From Windows
SSH tunnel and socks proxy
Troubleshooting
X11 forwarding not working on Mac
"I cannot type my password"
Banned after typing the wrong password 3 times
Connection refused
"I forgot my password, can you reset it ?"
Impossible to connect
Check the "current issues"
Cannot change locale (UTF-8) (Mac users)
Keyboard issue (Mac users)
Switch edu-ID login issue
Applications and libraries
Applications on the clusters
Module - lmod
How to use 'module'
What do I do when an application is not available via 'module' ?
Detailed example of using 'module'
Loading 'R'
Choosing the compiler toolchain
FOSS toolchain
Intel toolchain
Intel compiler licenses
fosscuda toolchain
Examples for selected applications
OpenMPI
Specify MCA parameters through ''srun''
Conda
Conda environment management
Package management
ADF
Gaussian
Git
Gurobi
Jupyter notebook and Jupyter lab
Mathematica
Matlab
Parallel with Matlab
Pass sbatch arguments to Matlab
Compile your Matlab code
Matlab PATH
Matlab java.opts
CHROMIUM mailbox/texture errors
Wavelab
OpenCL
Remote ParaView
Python
Custom Python lib
Pip install from source
R project and RStudio
RStudio
R packages
Variant Effect Predictor (VEP)
Install species
Apptainer (was Singularity)
Intro
Pull an existing image
Convert a Docker image
Run a container
Modify the image (not persistent)
Modify the image (persistent)
References
Stata
TensorFlow
Compile and install software in your /home
Storage
Storage
Cluster storage
Home directory
Quota
Scratch directory
Quota
Local storage
Scratch directory (local on each node)
Temporary private space
Temporary shared space
Sharing files with other users
Best practices
I/O performance
Check disk usage on the clusters
Check disk usage on home and scratch
Check disk usage on NASAC
File transfer (between nodes)
Backup
Archive
Access external storage
NASAC
Troubleshooting
Where does gio mount my data?
List the user DBUS process
CVMFS
Robinhood
Policies
Report
Slurm and job management
Slurm and job management
What is Slurm ?
Resources
Partitions
What is a partition ?
Which partition for my job ?
Partition lists
Cluster partitions
Private partitions
Wall time
Memory
GPU
CPU types
Single thread vs multi thread vs distributed jobs
Submitting jobs
Batch mode (sbatch)
Monothreaded jobs
Multithreaded jobs
Distributed jobs
GPGPU jobs
Interactive jobs
Job array
Advanced usage
Job dependency
Master/Slave
Checkpoint
Reservation
Job monitoring
Email notification of job events
Memory and CPU usage
Energy usage
CPUs
GPUs
Job history
Report and statistics with sreport
Other tools
spart
pestat
seff
HDF5 profiling plugin
Cancel jobs
Job priorities
How is the priority of a job determined ?
Priority vs. waiting time
Backfill mechanism
Best practices and smart use of the HPC resources
Introduction
First steps
Rules and etiquette
Think green
Stop wasting resources!
Which resources ?
Single thread vs multi thread vs distributed jobs
Bad CPU usage
Bad memory usage
Bad time estimation
Conclusion
How to write a good Slurm sbatch script
Transfer data from one cluster to another with Rsync
Glossary
Glossary
FAQ
FAQ - Frequently asked questions
General
Which cluster should I use ?
I'm lost, where can I find support ?
Citation, publication and acknowledgments
The cluster is slow
Account
When does my account expire ?
I'm leaving UNIGE, can I continue to use the Baobab HPC service ?
Storage
Applications
What applications are installed on Baobab ?
Can you install the software XYZ on Baobab ?
Can I use any Microsoft Windows software ?
Can I use proprietary licensed software ?
I need a different Linux distribution/version, am I stuck ?
Running jobs (SLURM)
I am already familiar with ''torque/pbs/sge/lsf/...'', what are the equivalent concepts in Slurm ?
Can I run some small test runs on the login node ?
I want to run the same job several times with different parameters...
What partition should I choose ?
Can I launch a job longer than 4 days ?
How are the priorities computed ?
My jobs stay a long time in the pending queue...
Can I run interactive tasks ?
I'm not able to use all the cores of a compute node
Troubleshooting
Check SSH key
Illegal instruction