
Testing overriding of the MPI in EESSI #121

Open · ocaisa opened this issue Jul 2, 2021 · 5 comments


ocaisa commented Jul 2, 2021

I've been doing some successful testing of #116 and I'd like others to also give it a try. For minimal testing you just need to set up the override directory:

# Make a directory for our overrides
sudo mkdir -p /opt/eessi
# Let's allow working in user space
sudo chown $USER /opt/eessi
# Create the necessary directory structure (/cvmfs/pilot.eessi-hpc.org/host_injections is by default a symlink to /opt/eessi)
mkdir -p /cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI
# If the MPI you want to use is loaded as an EasyBuild module, you can just do
ln -s $EBROOTOPENMPI /cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI/system  
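If the MPI you want to use is not an EasyBuild module, you should be able to point the symlink at the MPI installation prefix directly instead (a minimal sketch, assuming the prefix contains the usual lib directory; /path/to/host/openmpi is a placeholder):

# Hypothetical prefix; replace with the location of the OpenMPI installation to inject
ln -s /path/to/host/openmpi /cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI/system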

The actual test you can then run is (e.g.):

# Check the default linker for the executable can find all the libraries
/cvmfs/pilot.eessi-hpc.org/2021.06/compat/linux/x86_64/usr/bin/ldd /cvmfs/pilot.eessi-hpc.org/2021.06/software/linux/x86_64/amd/zen2/software/OSU-Micro-Benchmarks/5.6.3-gompi-2020a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw
# Run a simple MPI test
mpirun -n 2 /cvmfs/pilot.eessi-hpc.org/2021.06/software/linux/x86_64/amd/zen2/software/OSU-Micro-Benchmarks/5.6.3-gompi-2020a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw
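If the ldd step reports missing libraries, those need to be dealt with first; filtering for unresolved entries makes this quick to check (an empty result means everything resolved):

# Any "not found" lines indicate libraries that still need to be provided
/cvmfs/pilot.eessi-hpc.org/2021.06/compat/linux/x86_64/usr/bin/ldd /cvmfs/pilot.eessi-hpc.org/2021.06/software/linux/x86_64/amd/zen2/software/OSU-Micro-Benchmarks/5.6.3-gompi-2020a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw | grep "not found"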

ocaisa commented Jul 2, 2021

I've tested this with another OpenMPI from EESSI, and it worked out of the box when using an MPI built on top of the same Gentoo Prefix. When trying to use an MPI from a different prefix layer, I also had to add a symlink for libdl:

system/lib:
total 0
lrwxrwxrwx. 1 ocaisa1 ocaisa1  78 Jul  2 12:16 libdl.so.2 -> /cvmfs/pilot.eessi-hpc.org/2021.06/compat/linux/x86_64/lib/../lib64/libdl.so.2
lrwxrwxrwx. 1 ocaisa1 ocaisa1 115 Jul  2 12:15 libmpi.so.40 -> /cvmfs/pilot.eessi-hpc.org/2021.03/software/linux/x86_64/amd/zen2/software/OpenMPI/4.0.3-GCC-9.3.0/lib/libmpi.so.40

When using an injected OpenMPI that is RPATH-ed, you should have no problems. If it is not, then the ldd test will probably report some missing libraries. These will also need to be placed in /cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI/system/lib, or you can use LD_LIBRARY_PATH to have them found (but do not put /usr/lib or /usr/lib64 in LD_LIBRARY_PATH, as this will break the compat layer).
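As a sketch, identifying and providing those missing libraries could look like the following (libfoo.so.1 and its source path are placeholders):

OVERRIDE=/cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI/system
# List the libraries that the injected MPI itself cannot resolve
/cvmfs/pilot.eessi-hpc.org/2021.06/compat/linux/x86_64/usr/bin/ldd $OVERRIDE/lib/libmpi.so.40 | grep "not found"
# Symlink each missing library into the override lib directory
ln -s /path/to/libfoo.so.1 $OVERRIDE/lib/libfoo.so.1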

I tried to get this to work together with Singularity but have not had success yet; advice on how to do this is welcome!


ocaisa commented Jul 2, 2021

I also used MPI directly from the host (OpenMPI 3, which is ABI-compatible with OpenMPI 4). This also worked, but there were a few warnings (which can be suppressed with OMPI_MCA_mca_base_component_show_load_errors=0).
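For example, for the osu_bw test above that would be:

# Silence warnings about MPI components that fail to load
export OMPI_MCA_mca_base_component_show_load_errors=0
mpirun -n 2 /cvmfs/pilot.eessi-hpc.org/2021.06/software/linux/x86_64/amd/zen2/software/OSU-Micro-Benchmarks/5.6.3-gompi-2020a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw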


ocaisa commented Jul 2, 2021

On CentOS 7, the directory layout that made this work looked like:

[ocaisa1@node1 OpenMPI]$ pwd
/cvmfs/pilot.eessi-hpc.org/host_injections/2021.06/software/linux/x86_64/amd/zen2/rpath_overrides/OpenMPI

[ocaisa1@node1 OpenMPI]$ ls -l
total 0
drwxrwxr-x. 3 ocaisa1 ocaisa1 17 Jul  2 14:12 OpenMPI_eessi
drwxrwxr-x. 3 ocaisa1 ocaisa1 17 Jul  2 14:13 OpenMPI_host
lrwxrwxrwx. 1 ocaisa1 ocaisa1 12 Jul  2 14:14 system -> OpenMPI_host

[ocaisa1@node1 OpenMPI]$ ls -l OpenMPI_*/lib
OpenMPI_eessi/lib:
total 0
lrwxrwxrwx. 1 ocaisa1 ocaisa1  78 Jul  2 12:16 libdl.so.2 -> /cvmfs/pilot.eessi-hpc.org/2021.06/compat/linux/x86_64/lib/../lib64/libdl.so.2
lrwxrwxrwx. 1 ocaisa1 ocaisa1 115 Jul  2 12:15 libmpi.so.40 -> /cvmfs/pilot.eessi-hpc.org/2021.03/software/linux/x86_64/amd/zen2/software/OpenMPI/4.0.3-GCC-9.3.0/lib/libmpi.so.40

OpenMPI_host/lib:
total 0
lrwxrwxrwx. 1 ocaisa1 ocaisa1 24 Jul  2 12:26 libhwloc.so.5 -> /usr/lib64/libhwloc.so.5
lrwxrwxrwx. 1 ocaisa1 ocaisa1 36 Jul  2 12:25 libmpi.so.40 -> /usr/lib64/openmpi3/lib/libmpi.so.40

Both of these tests used RPATH-ed OpenMPI builds; without RPATH you would need to add additional libraries (libopen-rte.so.40 and libopen-pal.so.40 are the minimum, I think, or you can just use LD_LIBRARY_PATH).
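As a sketch, for the host layout above the extra symlinks for a non-RPATH build would be something like (the exact set of required libraries should be confirmed with the ldd test):

# Library names taken from the comment above; verify what is actually missing first
ln -s /usr/lib64/openmpi3/lib/libopen-rte.so.40 OpenMPI_host/lib/libopen-rte.so.40
ln -s /usr/lib64/openmpi3/lib/libopen-pal.so.40 OpenMPI_host/lib/libopen-pal.so.40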


ocaisa commented Jul 12, 2021

I also tried something similar to this to test overriding MPI on AWS Skylake with EFA (using LD_PRELOAD to force picking up my provided libraries, as we don't have a 2021.06 stack for this yet). There is no real performance difference (minimum latency of about 17 microseconds, maximum point-to-point bandwidth of about 9000 MB/s). However, it is clear that there are cases where this may not be perfect:

Program:     gmx mdrun, version 2020.4-MODIFIED
Source file: src/gromacs/hardware/hardwaretopology.cpp (line 614)
Function:    gmx::{anonymous}::parseHwLoc(gmx::HardwareTopology::Machine*, gmx::HardwareTopology::SupportLevel*, bool*)::<lambda()>
MPI rank:    3 (out of 48)

Assertion failed:
Condition: (hwloc_get_api_version() >= 0x20000)
Mismatch between hwloc headers and library, using v2 headers with v1 library

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

The problem is that the hwloc version needed by the AWS OpenMPI is not API-compatible with the one used by GROMACS. Hopefully this is a corner case...

Job script for latency/bandwidth
#!/bin/bash -x
#SBATCH --time=00:20:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --nodes=2

module load OSU-Micro-Benchmarks

ldd $(which osu_latency)

mpirun -n 2 osu_latency
mpirun -n 2 osu_bw

export LD_PRELOAD=/opt/amazon/openmpi/lib64/libmpi.so.40:/opt/amazon/openmpi/lib64/libopen-rte.so.40:/opt/amazon/openmpi/lib64/libopen-pal.so.40:/lib64/libhwloc.so.5:/lib64/libevent_core-2.0.so.5:/lib64/libevent_pthreads-2.0.so.5:/lib64/libnl-3.so.200:/lib64/libnl-route-3.so.200

ldd $(which osu_latency)

/opt/amazon/openmpi/bin/mpirun -n 2 osu_latency
/opt/amazon/openmpi/bin/mpirun -n 2 osu_bw
Job script for GROMACS test
#!/bin/bash -x
#SBATCH --time=00:20:00
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=2
#SBATCH --nodes=2

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

module load GROMACS

rm logfile.log ener.edr

ldd $(which gmx_mpi)

mpirun -n 48 gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 20000 -g logfile -dlb yes

export LD_PRELOAD=/opt/amazon/openmpi/lib64/libmpi.so.40:/opt/amazon/openmpi/lib64/libopen-rte.so.40:/opt/amazon/openmpi/lib64/libopen-pal.so.40:/lib64/libhwloc.so.5:/lib64/libevent_core-2.0.so.5:/lib64/libevent_pthreads-2.0.so.5:/lib64/libnl-3.so.200:/lib64/libnl-route-3.so.200

ldd $(which gmx_mpi)

rm logfile.log ener.edr

/opt/amazon/openmpi/bin/mpirun -n 48 gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 20000 -g logfile -dlb yes


ocaisa commented Aug 4, 2021

Regarding hwloc, it would be wise to inspect the ABI of the version that gets pulled in with the MPI and check its compatibility with the version that EESSI uses (https://www.open-mpi.org/projects/hwloc/doc/v2.4.0/a00364.php#faq_version_abi). The issue is most likely to arise with older underlying OSes (like CentOS 7). We don't need to fail hard (since this would only affect packages that rely on MPI and have a hwloc dependency), but we should probably print a warning that this issue may arise.
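A rough sketch of such a check, using the library soname as a heuristic (hwloc 1.x ships libhwloc.so.5, while hwloc 2.x ships libhwloc.so.15; /path/to/injected/libmpi.so.40 is a placeholder):

# Look at which hwloc soname the injected MPI was linked against
HWLOC_LIB=$(ldd /path/to/injected/libmpi.so.40 | awk '/libhwloc/ {print $1}')
if [ "$HWLOC_LIB" = "libhwloc.so.5" ]; then
  echo "WARNING: injected MPI uses the hwloc v1 ABI; packages built against hwloc v2 (e.g. GROMACS above) may fail"
fi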
