Yes, absolutely. You do not need root privileges to install Julia and its packages. However, it is often better to put the Julia depot - the place where packages, artifacts, and other things are stored - on the parallel file system rather than into $HOME. See the Getting Started section for more information.
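For example, juliaup can be installed entirely in user space. A minimal sketch, assuming the login nodes have outbound HTTPS access:

```bash
# Install Julia via juliaup - no root privileges required:
curl -fsSL https://install.julialang.org | sh
```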
No, you should not. Use the pre-built binaries from the website or, preferably, a corresponding Lmod module provided by the cluster admins. Compiling Julia from source will generally not improve the performance of your code (Julia is a compiler in itself) but may very well be cumbersome. Unless you know what you're doing and have a very good reason (e.g. creating a debug build with flags like USE_INTEL_JITEVENTS=1), you should not compile Julia yourself.
The installation path of the julia executable will be added to your ~/.bashrc by juliaup. If the job scheduler does not load .bashrc in non-interactive jobs, you must use the full path to julia in your job script. You can find out the full path by starting the Julia REPL and typing Sys.BINDIR. By default, it should be ~/.juliaup/bin/julia.
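A minimal sketch of a SLURM job script, assuming the default juliaup location (the resource requests are placeholders):

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# .bashrc might not be loaded here, so use the full path to julia:
~/.juliaup/bin/julia --project mycode.jl
```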
Ideally, you should set JULIA_DEPOT_PATH to point to a place with the following properties:
- good (parallel) I/O
- no tight quotas
- read and write access
- no automatic deletion of unused files (or otherwise you have to find a workaround)

As a rule of thumb: put the Julia depot on the parallel file system (typically $SCRATCH).
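For example, you could put the following into your ~/.bashrc (a sketch; that $SCRATCH satisfies the properties above is an assumption about your cluster):

```bash
# Store packages, artifacts, etc. on the parallel file system:
export JULIA_DEPOT_PATH=$SCRATCH/.julia
```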
Assuming that SLURM is used on your HPC cluster, you should generally start Julia under srun, e.g. srun julia --project mycode.jl. This is especially important if your code is MPI-parallel, in which case srun is a replacement for mpirun/mpiexec, but it is also recommended for serial code (among other things, srun takes care of proper resource binding and accounting).
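A minimal sketch of an MPI-parallel job script (node and task counts are placeholders):

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
#SBATCH --time=01:00:00

# srun plays the role of mpirun/mpiexec here:
srun julia --project mycode.jl
```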
Note that you can use the slim mpiexecjl wrapper from MPI.jl to use the "correct" MPI driver automatically.
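A sketch of how this typically looks (by default the wrapper is installed into the bin folder of your Julia depot; adjust the path if your depot is elsewhere):

```bash
# Install the wrapper once (into the depot's bin directory):
julia --project -e 'using MPI; MPI.install_mpiexecjl()'

# Launch 4 MPI ranks using the MPI backend MPI.jl is configured with:
~/.julia/bin/mpiexecjl --project=. -n 4 julia mycode.jl
```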
Set JULIA_CPU_TARGET to a value that is generic enough to cover all the types of CPUs that you're targeting. You can get the CPU target name for the current system with Sys.CPU_NAME.

Example: JULIA_CPU_TARGET="generic;skylake-avx512,clone_all;znver2,clone_all".

This compiles all functions (clone_all) for Intel Skylake (skylake-avx512) and AMD Zen 2 (znver2), and provides a generic fallback (generic).
For more information, see this section of the Julia documentation and this section of the developer documentation.
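For instance, you could query the CPU name on each kind of node and then set the environment variable in your ~/.bashrc or job script (the targets below just repeat the example above):

```bash
# Run this once per node type to find the matching target name:
julia -e 'println(Sys.CPU_NAME)'

# Then cover all relevant node types plus a generic fallback:
export JULIA_CPU_TARGET="generic;skylake-avx512,clone_all;znver2,clone_all"
```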
While the Distributed standard library provides some convenient tools and has its use cases, you should generally use MPI.jl to scale your code up (e.g. to thousands of compute nodes). Not only is MPI the established standard for distributed computing in any programming language, it also makes use of the fast interconnects in HPC clusters (which the Distributed standard library currently doesn't).
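A minimal sketch of an MPI.jl program (the file name is just an example; on a SLURM cluster you would typically launch via srun, as discussed above):

```bash
# Write a minimal MPI.jl example to a file:
cat > mpi_hello.jl <<'EOF'
using MPI
MPI.Init()
comm = MPI.COMM_WORLD
println("Hello from rank $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))")
MPI.Finalize()
EOF

# Launch with one MPI rank per SLURM task:
srun julia --project mpi_hello.jl
```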
If JLLs work fine for you, then use them. JLLs have the big advantage that they are convenient and, in many cases, "just work" out of the box. System software can (but doesn't necessarily) give better performance, but overriding the relevant bits of JLLs can be cumbersome. Generally speaking, we recommend manually replacing JLL libraries with system software only if the JLLs don't work (e.g. if a vendor-specific MPI is required). However, in such a case it would be even better to nudge the HPC admins to make this setup permanent and generally available in the form of a Julia Lmod module.
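For the MPI case specifically, switching MPI.jl to a system MPI is done via MPIPreferences. A minimal sketch, assuming the system MPI is already on the path (e.g. via module load):

```bash
# Make MPI.jl use the system MPI library instead of the MPI JLL:
julia --project -e 'using MPIPreferences; MPIPreferences.use_system_binary()'
```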
In a distributed computing scenario with, e.g., many thousands of Julia (MPI) processes, accessing the same Julia depot on a shared file system - when loading packages and precompiled cache files on using PackageX - can become (very) time consuming. A workaround is to bundle up the Julia depot (e.g. as a .tar.gz), distribute it to the local node storage (if available) or local memory (often mounted as /tmp) of all assigned compute nodes, and then set JULIA_DEPOT_PATH accordingly.
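A minimal sketch of such a job script, assuming SLURM and that the bundle ($SCRATCH/depot.tar.gz) has been created beforehand with tar -czf depot.tar.gz -C $SCRATCH .julia:

```bash
#!/bin/bash
#SBATCH --nodes=64
#SBATCH --time=01:00:00

# Broadcast the bundled depot to every node and unpack it into /tmp:
sbcast $SCRATCH/depot.tar.gz /tmp/depot.tar.gz
srun --ntasks-per-node=1 tar -xzf /tmp/depot.tar.gz -C /tmp

# Point Julia at the node-local depot copies:
export JULIA_DEPOT_PATH=/tmp/.julia
srun julia --project mycode.jl
```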
LD_LIBRARY_PATH issues?

When using Julia on a system that uses an environment-variable based module system (such as modules or Lmod), the LD_LIBRARY_PATH variable might be filled with entries pointing to different packages and libraries. This might (or might not) lead to issues stemming from Julia loading libraries other than the ones packaged with it. If you encounter such problems, make sure that Julia's lib directory is always the first directory in LD_LIBRARY_PATH.
One possibility to achieve this is to create a wrapper shell script that modifies LD_LIBRARY_PATH before calling the Julia executable. Inspired by a script from UCL's Owain Kenway:
```bash
#!/usr/bin/env bash
# This wrapper makes sure the julia binary distribution picks up the GCC
# libraries provided with it correctly, meaning that it does not rely on
# the gcc-libs version.
# Dr Owain Kenway, 20th of July, 2021
# Source: https://github.com/UCL-RITS/rcps-buildscripts/blob/04b2e2ccfe7e195fd0396b572e9f8ff426b37f0e/files/julia/julia.sh

# Resolve the directory of this wrapper, then prepend Julia's bundled libs.
location=$(readlink -f "$0")
directory=$(readlink -f "$(dirname "$location")/..")

export LD_LIBRARY_PATH=${directory}/lib/julia:${LD_LIBRARY_PATH}
exec "${directory}/bin/julia" "$@"
```
Note that using readlink might not be optimal from a performance perspective if used in a massively parallel environment. Alternatively, hard-code the Julia path, as sketched below, or set an environment variable accordingly.
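A sketch of the hard-coded variant (the installation path is a placeholder):

```bash
#!/usr/bin/env bash
# Same idea without readlink; adjust JULIA_HOME to your installation.
JULIA_HOME=/opt/julia-1.10
export LD_LIBRARY_PATH=${JULIA_HOME}/lib/julia:${LD_LIBRARY_PATH}
exec ${JULIA_HOME}/bin/julia "$@"
```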
If a job has non-exclusive access to a node and has a memory limit that is lower than the total memory of the node, set the --heap-size-hint command line option to an appropriate value when starting Julia in the job script, e.g. julia --heap-size-hint=4G my_script.jl if you have requested a memory limit of 4G for running my_script.jl. This communicates the memory limit to Julia's garbage collector to enable more aggressive garbage collection when the memory limit is approached.
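A sketch of how the two settings fit together in a SLURM job script:

```bash
#!/bin/bash
#SBATCH --mem=4G

# Tell the garbage collector about the job's memory limit:
srun julia --heap-size-hint=4G my_script.jl
```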
Try setting JULIA_CUDA_MEMORY_POOL=none (see the CUDA.jl documentation for more information).
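That is, in your job script before starting Julia:

```bash
# Disable CUDA.jl's memory pool so freed GPU memory is returned eagerly:
export JULIA_CUDA_MEMORY_POOL=none
```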
Errors during precompilation (Resource temporarily unavailable)?

By default, Julia uses many parallel tasks during precompilation. On the login nodes of some HPC clusters, parallel processes might be subject to resource restrictions. In these cases, you might want to set JULIA_NUM_PRECOMPILE_TASKS to a low value, e.g. export JULIA_NUM_PRECOMPILE_TASKS=1 (single task).
Yes, at least for CUDA.jl. See this part of the CUDA.jl documentation.
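For CUDA.jl, a minimal sketch (assuming a recent CUDA.jl version and that the cluster's CUDA toolkit is available, e.g. via module load cuda):

```bash
# Make CUDA.jl use the local (cluster-provided) CUDA toolkit
# instead of downloading the CUDA artifact:
julia --project -e 'using CUDA; CUDA.set_runtime_version!(local_toolkit=true)'
```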