hpc:best_practices

Differences

This shows you the differences between two versions of the page.

hpc:best_practices [2021/01/25 16:45]
Yann Sagon [Single thread vs multi thread vs distributed jobs]
hpc:best_practices [2023/05/26 15:05]
Adrien Albert
Line 1: Line 1:
-<title> Best practices and smart use of the HPC resources </title>+{{METATOC 1-5}}
  
  
-This page gives best practices and tips on how to use the clusters **Baobab** and **Yggdrasil**. 
  
 ====== Introduction ====== ====== Introduction ======
 +This page gives best practices and tips on how to use the clusters **Baobab** and **Yggdrasil**.
 +
  
 An HPC cluster is an advanced, complex and always-evolving piece of technology. It's easy to forget details and make mistakes when using one, so don't hesitate to check this section every now and then, yes, even if you are the local HPC guru in your team! There's always something new to learn! An HPC cluster is an advanced, complex and always-evolving piece of technology. It's easy to forget details and make mistakes when using one, so don't hesitate to check this section every now and then, yes, even if you are the local HPC guru in your team! There's always something new to learn!
Line 87: Line 88:
 ===== Single thread vs multi thread vs distributed jobs ===== ===== Single thread vs multi thread vs distributed jobs =====
  
-There are three job categories, each with different needs: +See [[hpc:slurm#single_thread_vs_multi_thread_vs_distributed_jobs|here]] to make sure you specify the correct configuration for your job type.
- +
-^ Job type            ^ Number of CPUs used                                      ^ Examples          ^ Keywords         ^ Slurm option                          ^ +
-| **single threaded** | **one CPU**                                              | Python, plain R   | -                |                                       | +
-| **multi threaded**  | **all the CPUs** of a compute node (best case scenario)  | Matlab, Stata-MP  | OpenMP, SMP      | <nowiki>--cpus-per-task</nowiki>      | +
-| **distributed**     | can spread tasks on multiple compute nodes               | Palabos, OpenFOAM | OpenMPI, workers | <nowiki>--ntasks</nowiki>             | +
- +
- +
-There are also **hybrid** jobs, where each task of the job behaves like a multi-threaded job.  +
-This is not very common and we won't cover this case. +
- +
-In Slurm, you have two options for requesting CPU resources: +
- +
-  * ''<nowiki>--cpus-per-task</nowiki>'': this specifies that you want more than one CPU per task.  +
-  * ''<nowiki>--ntasks</nowiki>'': this launches your job n times. **ONLY** specify a value greater than one if your job knows how to handle multitasking properly. For example, an OpenMPI job can benefit from this option. If your job doesn't handle this option correctly, it will be launched n times, each instance doing exactly the same thing. This is not what you want: it wastes resources and creates corrupted output files.+
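As an illustration of the multi-threaded case described above, here is a minimal sketch of a batch script. It assumes an OpenMP program; the program name ''my_openmp_program'' and the value 4 are hypothetical. ''SLURM_CPUS_PER_TASK'' is the environment variable Slurm sets when ''<nowiki>--cpus-per-task</nowiki>'' is given.

```shell
#!/bin/sh
#SBATCH --ntasks=1          # one task: the program is launched only once
#SBATCH --cpus-per-task=4   # CPUs for that single task (4 is an arbitrary example)

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given.
# Fall back to 1 so the script also runs outside of a Slurm job.
OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
export OMP_NUM_THREADS

echo "Running with ${OMP_NUM_THREADS} thread(s)"

# Hypothetical program name; uncomment and adapt to your own job.
# srun ./my_openmp_program
```

Keeping ''<nowiki>--ntasks</nowiki>'' at 1 here avoids launching several identical copies of a program that does not know how to cooperate across tasks.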
  
  
Line 154: Line 141:
   * Do I want to receive email notifications?   * Do I want to receive email notifications?
  * This is optional, but you can specify the level of detail you want with the ''<nowiki>--mail-type</nowiki>'' parameter.  * This is optional, but you can specify the level of detail you want with the ''<nowiki>--mail-type</nowiki>'' parameter.
 +
 +====== Transfer data from one cluster to another ======
 +===== Rsync =====
 +This guide assumes you want to transfer the directory ''<nowiki>$HOME/my_projects/the_best_project_ever</nowiki>'' from Baobab to Yggdrasil, keeping the same path. You can adapt it to your case by changing the variables.
 +
 +
 +__**Rsync options:**__
 +  * ''<nowiki>-a, --archive</nowiki>'' is equivalent to ''<nowiki>-rlptgoD</nowiki>''. It is a quick way of saying you want recursion and want to preserve almost everything (with ''<nowiki>-H</nowiki>'' being a notable omission). The only exception to this equivalence is when ''<nowiki>--files-from</nowiki>'' is specified, in which case ''<nowiki>-r</nowiki>'' is not implied.
 +  * ''<nowiki>-v</nowiki>'' increases verbosity
 +  * ''<nowiki>-i</nowiki>'' turns on the itemized format, which shows more information than the default format
 +  * ''<nowiki>-b</nowiki>'' makes rsync back up files that exist in both folders, appending ''<nowiki>~</nowiki>'' to the name of the old file. You can change this suffix with ''<nowiki>--suffix .suf</nowiki>''
 +  * ''<nowiki>-u</nowiki>'' makes rsync skip files that are newer in the destination than in the source
 +  * ''<nowiki>-z</nowiki>'' turns on compression, which is useful when transferring easily-compressible files over slow links
 +  * ''<nowiki>-P</nowiki>'' turns on ''<nowiki>--partial</nowiki>'' and ''<nowiki>--progress</nowiki>''
 +  * ''<nowiki>--partial</nowiki>'' makes rsync keep partially transferred files if the transfer is interrupted
 +  * ''<nowiki>--progress</nowiki>'' shows the progress of each file transfer, which is useful if you transfer big files
 +  * ''<nowiki>-n, --dry-run</nowiki>'' performs a trial run with no changes made
 +
 +1) Go to your directory containing ''<nowiki>the_best_project_ever</nowiki>'':
 +<code>
 +(baobab)-[toto@login2 ~]$ cd $HOME/my_projects/
 +</code>
 +
 +2) Set the variables (optional: you can also type the values directly in the rsync command)
 +<code>
 +(baobab)-[toto@login2 my_projects]$ DST=$HOME/my_projects/
 +(baobab)-[toto@login2 my_projects]$ DIR=the_best_project_ever
 +(baobab)-[toto@login2 my_projects]$ YGGDRASIL=login1.yggdrasil
 +</code>
 +3) Run the rsync
 +<code>
 +(baobab)-[toto@login2 my_projects]$ rsync -aviuzP ${DIR} ${YGGDRASIL}:${DST}
 +</code>
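Before copying anything, it can help to preview the transfer with the ''<nowiki>-n</nowiki>'' (dry-run) option listed above. The following is a minimal sketch, not part of the original procedure: the helper ''transfer_cmd'' is a hypothetical name, and it only prints the rsync command so you can review it before running it on the login node. The variable values are the ones from step 2.

```shell
#!/bin/sh
# Values from step 2; adapt them to your case.
DST="$HOME/my_projects/"
DIR="the_best_project_ever"
YGGDRASIL="login1.yggdrasil"

# Hypothetical helper: print the rsync command for a dry run or a real run.
# $1 = "dry" adds -n, so rsync only lists what it would transfer.
transfer_cmd() {
    opts="-aviuzP"
    if [ "$1" = "dry" ]; then
        opts="${opts}n"
    fi
    echo "rsync ${opts} ${DIR} ${YGGDRASIL}:${DST}"
}

transfer_cmd dry    # command to run first: nothing is copied
transfer_cmd real   # command to run once the dry-run output looks right
```

Run the first printed command, check the itemized list of files, then run the second one to perform the actual transfer.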
hpc/best_practices.txt · Last modified: 2023/05/26 15:07 by Adrien Albert