
Singleton init mode, MPI_Comm_spawn, and OPAL_PREFIX #12349

Open
dalcinl opened this issue Feb 19, 2024 · 22 comments

@dalcinl
Contributor

dalcinl commented Feb 19, 2024

I'm working on producing a binary Open MPI Python wheel to allow for pip install openmpi. I had to add a few hacks here and there to overcome a few issues strictly related to how wheel files are stored, how they are installed in Python virtual environments, the lack of post-install hooks, and the expectation that things work without activating the environment. All of this, of course, requires relocating the Open MPI installation. At that point I found a minor issue that I don't know how to overcome.

First, a clarification: I'm using the internal PMIx and PRRTE. Supposedly, the OPAL_PREFIX env var is all that is needed for things to work out of the box when relocating an Open MPI install. However, I think I came across a corner case: in singleton init mode, I believe OPAL_PREFIX is simply ignored, and if the tools cannot be located via $PATH, things do not work.

Looking at the spawn code, I see a function start_dvm() in ompi/dpm/dpm.c with the following code:

    /* find the prte binary using the install_dirs support - this also
     * checks to ensure that we can see this executable and it *is* executable by us
     */
    cmd = opal_find_absolute_path("prte");

However, I believe opal_find_absolute_path() does not care at all about the OPAL_PREFIX env var; eventually, it only looks in PATH. The comment "find the prte binary using the install_dirs support" is simply not true.

@hppritcha Your input is much appreciated here. I do have a reproducer, but so far it is based on Python and intermediate binary assets I'm generating locally. I hope the description above is enough for you to realize the issue.

@hppritcha
Member

I'll take a look at adjusting the path search. Having enough problems with #12307 at the moment.

@dalcinl
Contributor Author

dalcinl commented Feb 19, 2024

> I'll take a look at adjusting the path search. Having enough problems with #12307 at the moment.

Sorry! 😢

@wenduwan
Contributor

@dalcinl Just to be clear, is this a problem on the main branch as well?

@dalcinl
Contributor Author

dalcinl commented Feb 22, 2024

@wenduwan I discovered this issue while testing with the 5.0.2 tarball. I cannot assert anything about the main branch, but a quick glance at the code seems to indicate that the root issue is there as well.

@wenduwan
Contributor

Adding the main label so we don't ignore it for, e.g., 5.1.x.

@dalcinl
Contributor Author

dalcinl commented Feb 22, 2024

@jsquyres Would Open MPI accept platform-specific ways to derive OPAL_PREFIX automatically, relieving users of the need to set it explicitly?
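
For instance, on platforms that provide dladdr(3), something along these lines could recover the prefix from the on-disk location of the loaded library. This is purely an illustrative sketch (guess_prefix_from_library is a made-up name, not existing Open MPI code), and it assumes the usual <prefix>/lib layout:

    /* Illustrative only: derive an install prefix from the location of the
     * shared object (or executable) containing this function.  Requires
     * dladdr(3); build with -ldl on glibc. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <libgen.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static char *guess_prefix_from_library(void)
    {
        Dl_info info;
        char path[PATH_MAX];

        /* Resolve the path of the object containing this symbol, e.g.
         * <prefix>/lib/libmpi.so.40 when called from inside libmpi. */
        if (0 == dladdr((void *)&guess_prefix_from_library, &info) ||
            NULL == info.dli_fname ||
            NULL == realpath(info.dli_fname, path)) {
            return NULL;
        }
        char *libdir = dirname(path);   /* <prefix>/lib */
        char *prefix = dirname(libdir); /* <prefix>     */
        return strdup(prefix);
    }

    int main(void)
    {
        char *prefix = guess_prefix_from_library();
        printf("guessed prefix: %s\n", prefix ? prefix : "(unknown)");
        free(prefix);
        return 0;
    }

Whether something like this belongs in Open MPI proper or only in the wheel packaging is, of course, a separate question.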

@rhc54
Contributor

rhc54 commented Feb 22, 2024

All that is needed is to translate OPAL_PREFIX to PRTE_PREFIX in mpirun - but you can only do that if you are using the internal PRRTE. Same is true for PMIX_PREFIX.

@rhc54
Contributor

rhc54 commented Feb 22, 2024

If this is about singleton comm-spawn, then you need to put the translation at the bottom of the dpm.c file, before we fork/exec prte. Again, protected so it only applies to the internal PRRTE/PMIx.
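
A minimal sketch of that translation (the guard macros here are placeholders, not the real build flags, and this is not the actual patch):

    /* Sketch only: forward OPAL_PREFIX to the prefix variables understood by
     * the bundled PRRTE and PMIx before fork/exec'ing prte for the singleton
     * case.  The OMPI_USING_INTERNAL_* macros are stand-ins, not real flags. */
    #include <stdlib.h>

    static void translate_prefix_env(void)
    {
        const char *opal_prefix = getenv("OPAL_PREFIX");
        if (NULL == opal_prefix) {
            return; /* nothing to forward; rely on the configured paths */
        }
    #if defined(OMPI_USING_INTERNAL_PRRTE)
        /* Only safe for the bundled PRRTE; an external one has its own prefix. */
        setenv("PRTE_PREFIX", opal_prefix, 0 /* don't clobber an explicit value */);
    #endif
    #if defined(OMPI_USING_INTERNAL_PMIX)
        setenv("PMIX_PREFIX", opal_prefix, 0);
    #endif
    }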

@dalcinl
Contributor Author

dalcinl commented Feb 22, 2024

@rhc54 Aren't you oversimplifying things a bit? opal_find_absolute_path("prte") will keep ignoring any XXX_PREFIX variable you may set, as it only looks in PATH.

I know almost nothing about the ompi codebase, but it looks to me that, after the environment variable translation you described, you also have to somehow replicate the logic of find_prterun() in ompi/tools/mpirun/main.c. I may be saying nonsense; in that case just ignore me.
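
Roughly something like the following, I imagine; the lookup order here is my guess at what a prefix-aware search would do, and locate_prte/configured_bindir are made-up names, not the actual helpers:

    /* Sketch of a prefix-aware lookup: prefer $OPAL_PREFIX/bin/prte, then the
     * bindir recorded at configure time, and only fall back to a PATH search.
     * Names are illustrative, not the real code in main.c or dpm.c. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    static char *locate_prte(const char *configured_bindir)
    {
        char candidate[4096];
        const char *opal_prefix = getenv("OPAL_PREFIX");

        /* 1. Relocated install: $OPAL_PREFIX/bin/prte */
        if (NULL != opal_prefix) {
            snprintf(candidate, sizeof(candidate), "%s/bin/prte", opal_prefix);
            if (0 == access(candidate, X_OK)) {
                return strdup(candidate);
            }
        }
        /* 2. The bindir baked in at configure time (the install_dirs idea) */
        if (NULL != configured_bindir) {
            snprintf(candidate, sizeof(candidate), "%s/prte", configured_bindir);
            if (0 == access(candidate, X_OK)) {
                return strdup(candidate);
            }
        }
        /* 3. Last resort: the PATH search that opal_find_absolute_path("prte")
         * already performs today. */
        return NULL;
    }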

@rhc54
Contributor

rhc54 commented Feb 22, 2024

I wasn't oversimplifying things - just pointing out that this isn't a difficult problem. Sure, you need to update the dpm code to duplicate what mpirun does, but that's a copy/paste operation. The code in mpirun is actually doing the "right thing" - it's just the dpm that is out of date.

@rhc54
Contributor

rhc54 commented Feb 22, 2024

Try #12363 - took longer to compile than to do 🤷‍♂️

@wenduwan
Contributor

wenduwan commented May 2, 2024

This issue is losing steam.
@dalcinl I wonder if the issue is covered by the mpi4py CI? The ompi mpi4py CI looks green now. Is this still a problem?

@dalcinl
Contributor Author

dalcinl commented May 4, 2024

@wenduwan I believe this issue is already fixed in main, at least in my local testing. I'm still getting some warnings, but these are unrelated:

[kw61149:2141245] shmem: mmap: an error occurred while determining whether or not /tmp/ompi.kw61149.1000/jf.0/1892286464/shared_mem_cuda_pool.kw61149 could be created.
[kw61149:2141245] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728 

And no, the mpi4py CI tests do not specifically test for this issue. After all the work @jsquyres put into it, adding a new workflow should be quite easy if you really want to have a quick test for relocation via OPAL_PREFIX+LD_LIBRARY_PATH.

@wenduwan
Contributor

wenduwan commented May 6, 2024

The warnings are real. @edgargabriel and @lrbison both observed them in singleton mode.

I think adding a minimal relocation test is reasonable. IIUC the test will look like the following:

  1. Install Open MPI as normal
  2. Install mpi4py as normal
  3. Move Open MPI installation to another location, e.g. mv /original/install /new/install
  4. export OPAL_PREFIX=/new/install/...
  5. export LD_LIBRARY_PATH=/new/install/lib:$LD_LIBRARY_PATH
  6. Run tests with /new/install/bin/mpirun ... -x LD_LIBRARY_PATH -x OPAL_PREFIX

Please correct me if I'm wrong.

@rhc54
Contributor

rhc54 commented May 6, 2024

Not sure I understand this report. Are you saying running mpirun generates a warning? That isn't a singleton issue, so what precisely are you saying is wrong?

@edgargabriel
Member

No, it's the other way around: with mpirun there is no warning, only with a singleton.

@rhc54
Contributor

rhc54 commented May 6, 2024

Thanks - and the warning is still there on the head of current main?

@edgargabriel
Member

I think so; I think I saw it a few days ago. I can double-check in a few days and look into it. It's coming from the btl/smcuda component, I think.

@hppritcha
Member

You see this smcuda warning any time the MPI process doesn't have a PRRTE PMIx server to talk to.
So singletons and jobs direct-launched with srun (and likely palsrun) also give the warning, one instance per node of the job as I recall. You don't see this with btl/sm because it has a fallback if there doesn't appear to be a PMIx-server-created session dir.

@rhc54
Contributor

rhc54 commented May 6, 2024

Yeah, this has nothing to do with moving the installation. I had posted a patch to silence this warning, but it ran into a bunch of nit-picking and so I deleted it. It's a trivial fix to silence it, if someone else wants to try to get it committed.

@dalcinl
Contributor Author

dalcinl commented May 6, 2024

@wenduwan #12526

@wenduwan
Contributor

wenduwan commented May 6, 2024

@dalcinl Thank you very much!

@wenduwan removed this from the v5.0.4 milestone on Jul 18, 2024