PMIx (2.0.1) #466

Closed
koomie opened this issue Jun 11, 2017 · 6 comments

@koomie
Contributor

koomie commented Jun 11, 2017

https://pmix.github.io/pmix/

@adrianreber
Member

I have prepared an upgrade of pmix to 2.0.1 in combination with the Open MPI upgrade to 3.0.0.

I also tried to build slurm against pmix 2.0.1, but it fails. Slurm correctly detects pmix 2.0.1:

checking for pmix installation... /opt/ohpc/pub/libs/pmix/2.0.1/

But I get errors during compilation:

$ make
/bin/sh ../../../../libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm  -I../../../.. -I../../../../src/common -I/usr/include -I/opt/ohpc/pub/libs/pmix/2.0.1//include -DHAVE_PMIX_VER=2   -DNUMA_VERSION1_COMPATIBILITY -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -pthread -Wall -g -O0 -fno-strict-aliasing -c -o mpi_pmix_v2_la-pmixp_client.lo `test -f 'pmixp_client.c' || echo './'`pmixp_client.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm -I../../../.. -I../../../../src/common -I/usr/include -I/opt/ohpc/pub/libs/pmix/2.0.1//include -DHAVE_PMIX_VER=2 -DNUMA_VERSION1_COMPATIBILITY -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -pthread -Wall -g -O0 -fno-strict-aliasing -c pmixp_client.c  -fPIC -DPIC -o .libs/mpi_pmix_v2_la-pmixp_client.o
In file included from /usr/include/unistd.h:25:0,
                 from pmixp_common.h:41,
                 from pmixp_client.c:38:
/usr/include/features.h:330:4: warning: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Wcpp]
 #  warning _FORTIFY_SOURCE requires compiling with optimization (-O)
    ^
pmixp_client.c: In function ‘_set_procdatas’:
pmixp_client.c:468:24: error: request for member ‘size’ in something not a structure or union
   kvp->value.data.array.size = count;
                        ^
pmixp_client.c:482:24: error: request for member ‘array’ in something not a structure or union
   kvp->value.data.array.array = (pmix_info_t *)info;
                        ^
make: *** [mpi_pmix_v2_la-pmixp_client.lo] Error 1
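
(For context: this failure looks like the pmix_value_t layout change between the PMIx 1.x and 2.x headers. In 1.x, data.array is an embedded pmix_info_array_t, so kvp->value.data.array.size is valid; in 2.x that member became a pointer (deprecated in favor of a pmix_data_array_t pointer), which is exactly the "request for member ... in something not a structure or union" that gcc reports. Below is a rough, self-contained sketch of the difference; the types are stand-ins, and the exact pmix_common.h member names are an assumption on my part, not copied from the headers.

/* Sketch only -- stand-in types, not the real pmix_common.h definitions. */
#include <stddef.h>

typedef int pmix_data_type_t;                   /* stand-in */
typedef struct { char key[512]; } pmix_info_t;  /* stand-in */

typedef struct {
    size_t size;
    pmix_info_t *array;
} pmix_info_array_t;

/* PMIx 1.x style: data.array is an embedded struct */
typedef struct {
    pmix_data_type_t type;
    union {
        int integer;                 /* other scalar members elided */
        pmix_info_array_t array;     /* embedded struct in 1.x */
    } data;
} pmix_value_v1_t;

/* PMIx 2.x style: the array member becomes a pointer */
typedef struct {
    pmix_data_type_t type;
    union {
        int integer;
        pmix_info_array_t *array;    /* pointer (deprecated) in 2.x */
    } data;
} pmix_value_v2_t;

int main(void)
{
    pmix_value_v1_t v1 = {0};
    pmix_value_v2_t v2 = {0};
    pmix_info_array_t backing = {0};

    v1.data.array.size = 4;       /* 1.x-style code like pmixp_client.c compiles */
    /* v2.data.array.size = 4; */ /* 2.x: "request for member 'size' in something
                                     not a structure or union"                    */
    v2.data.array = &backing;
    v2.data.array->size = 4;      /* a 2.x-aware port would need '->' here, or
                                     the newer pmix_data_array_t member           */
    return 0;
}

This matches the later note in this thread that slurm had not yet been ported to the PMIx v2.x API at this point.)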

@koomie
Contributor Author

koomie commented Sep 26, 2017

I checked with a developer who is more in the know on this, and the suggestion is to stick with an older variant of PMIx when configuring it as standalone with SLURM. Apparently, SLURM does not yet have support for v2.x - that is targeted for a Nov release. So, I have downgraded our pmix build to v1.2.3 for now; v1.2.4 is supposed to be out soon, so we can likely go with that for our Nov release. With this build, the companion SLURM build went ok with pmix enabled (can't comment on functionality yet).

@adrianreber
Member

@koomie thanks for merging and for the downgrade to make it work with slurm. I also see that you removed the version number from the directory path: %global install_path %{OHPC_LIBS}/%{pname}

Curious why you removed the version?

@koomie
Contributor Author

koomie commented Sep 28, 2017

The rationale behind that was a desire to be able to change the pmix installation independently of the MPI stacks (and resource managers). Since this is really more of an administrative package that is accessed by packages outside of Lmod (e.g. slurm), it is helpful if there is a constant path to the install. I would not expect a desire to have multiple PMIx installations co-existing, which is the motivating factor for versioned paths for all of the development tools/libraries accessed by developers.

I do like having it installed in a non-default path (e.g. /opt/ohpc/pub) like you have it, although that is likely going to necessitate adding some rpath flags for some of the MPI stacks, or we might consider dropping some ohpc-specific files into /etc/ld.so.conf.d for linker resolution. I'm not 100% sure what makes the best sense just yet... I was finally able to get openmpi and the latest PMIx build to run successfully with a version of SLURM pointing to the same PMIx. In local test builds (not in OBS) I was also able to get multi-node jobs to run similarly with MPICH. I think I'm close on MVAPICH2 but need to tinker some more.

I'm traveling the next two days, but will keep poking at it and we can iterate.

@koomie
Contributor Author

koomie commented Oct 6, 2017

Update: slurm + this standalone pmix + mpich encountered a change in behavior over previous builds in that execution of a singleton failed (e.g. just running an MPI binary outside of slurm). Thanks to support from @rhc54, there is a new patch to make this work (openpmix/openpmix#537) that we are now applying.
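
(For reference, "singleton" here means launching the MPI binary directly, without srun or mpiexec, so it starts as a single process. A minimal check of that path, assuming an MPI C compiler wrapper such as mpicc is available:

/* singleton.c -- sanity check that MPI_Init works when the binary is run
 * directly (no srun/mpiexec), the case that regressed above.
 * Build/run (assumed): mpicc singleton.c -o singleton && ./singleton */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank = -1, size = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("singleton ok: rank %d of %d\n", rank, size);  /* expect "rank 0 of 1" */
    MPI_Finalize();
    return 0;
}

With the patch from openpmix/openpmix#537 applied, this should again report rank 0 of 1 when run outside of slurm.)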

koomie added a commit that referenced this issue Oct 14, 2017:
…endent updates with resource managers and MPI stacks (#466).
koomie added a commit that referenced this issue Oct 14, 2017
koomie added the built label Oct 31, 2017
@koomie
Contributor Author

koomie commented Nov 4, 2017

Added a PMIx-based CI job, which is now passing.

koomie closed this as completed Nov 4, 2017