Would it be possible to keep libpython.a? #91

Closed
rdb opened this issue Jan 11, 2017 · 16 comments · Fixed by #1250

Comments

@rdb
Contributor

rdb commented Jan 11, 2017

The application I'm trying to build has a particular need to embed the interpreter into an executable, which requires static linking with libpython.

Would it perhaps be possible not to delete libpython.a from the image, but keep it in the _internal directory for those who are absolutely certain they need to use it?

@Moguri

Moguri commented Apr 28, 2018

Can we get some discussion going on this? Is this something that is a complete non-starter, or are pull requests welcome? What are the issues/concerns with something like this? Larger containers? Are wheels affected? Maybe this is something we can discuss for manylinux2010 if there is hesitation to have it as part of manylinux1.

@njsmith
Member

njsmith commented Apr 28, 2018

I think this is out of scope for manylinux, since it's about building portable binaries of your application, not about building manylinux wheels. You're welcome to reuse our scripts and so on, but I don't want to add it to the image. It's a lot of extra download weight.

@njsmith
Member

njsmith commented Apr 28, 2018

Note that if you start with something like the "holy build box" then building a copy of python to embed is probably not too difficult?

@Moguri

Moguri commented Apr 28, 2018

This is for building Panda3D wheels. We need libpython.a to provide users of the Panda3D wheel the necessary tools to deploy/freeze their Panda3D applications. At the moment, we have to keep re-integrating upstream manylinux1 changes into our Docker images for building Panda3D wheels, and it is getting tiresome. It would be much preferable to start with a manylinux1 base image.

Would you be open to discussion around making the manylinux1 scripts easier to modify? For example, if removing the libpython.a files was done in another layer of the image, we could just worry about modifying the Dockerfile (remove the line to remove libpython.a and add necessary packages for Panda3D) instead of modifying both the Dockerfile and the build scripts. This would make merging in upstream changes easier.

@njsmith
Member

njsmith commented Apr 28, 2018

Huh, so you're... shipping a copy of Python, inside your Python wheels?

@rdb
Contributor Author

rdb commented Apr 29, 2018

Sort of; we're shipping a stub executable that embeds the interpreter and runs whatever Python bytecode is appended to the end of the executable, similar to what py2exe does. This allows users to easily compile their Python code into executables; we just need to grab the appropriate .whl of the dependency packages and inject their bytecode into the stub.

Compiling this stub requires linking in libpython.a. It would save us a significant maintenance burden if there were some easy way to have manylinux1 contain the libpython.a file somewhere. If not, we'll just have to continue to maintain our own fork of manylinux. That said, perhaps it could be an option to enable when building the Docker images, e.g. via an environment variable?
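
To make the "bytecode appended to the end of the executable" idea concrete, here is a minimal sketch of one way such a payload could be laid out and read back. It is not Panda3D's actual format: the `PYLD` marker, the footer layout, and the function names are all hypothetical. The real stub would be a compiled program linked against libpython.a; this Python version only illustrates the append-and-locate step.

```python
import marshal
import struct

# Hypothetical footer: a 4-byte marker plus the payload length as a
# little-endian unsigned 64-bit integer. Not any real tool's format.
MAGIC = b"PYLD"
FOOTER = struct.Struct("<4sQ")


def append_payload(stub: bytes, source: str, name: str = "<app>") -> bytes:
    """Return the stub bytes with compiled bytecode and a footer appended."""
    payload = marshal.dumps(compile(source, name, "exec"))
    return stub + payload + FOOTER.pack(MAGIC, len(payload))


def read_payload(blob: bytes):
    """Find the payload through the trailing footer and return the code object."""
    if len(blob) < FOOTER.size:
        raise ValueError("blob too short to contain a footer")
    magic, length = FOOTER.unpack(blob[-FOOTER.size:])
    if magic != MAGIC:
        raise ValueError("no appended payload found")
    start = len(blob) - FOOTER.size - length
    if start < 0:
        raise ValueError("footer length exceeds blob size")
    return marshal.loads(blob[start:start + length])


def run_payload(blob: bytes) -> dict:
    """Execute the appended bytecode and return the resulting namespace."""
    namespace = {"__name__": "__main__"}
    exec(read_payload(blob), namespace)
    return namespace


if __name__ == "__main__":
    fake_stub = b"\x7fELF-placeholder-stub"  # stands in for the real executable
    blob = append_payload(fake_stub, "answer = 6 * 7")
    print(run_payload(blob)["answer"])
```

Marshalled bytecode is specific to the Python version that produced it, so a stub and its payload would have to come from matching interpreter versions.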

@SylvainCorlay

@Moguri we have run into the exact same use case in our project and would love to see libpython.a included in the Docker image.

Note that we are currently building our own Docker image from a fork of the manylinux repository. Unfortunately, your base Docker image is private, which makes the build non-reproducible.

@Moguri

Moguri commented Feb 15, 2020

@SylvainCorlay The dockerfile we are using can be found here.

@SylvainCorlay

SylvainCorlay commented Feb 15, 2020

@Moguri thanks. I meant to say that the base docker image of the official manylinux (https://github.com/pypa/manylinux/blob/master/docker/Dockerfile-x86_64#L2) is private.

I have opened #481 about this.

@rdb
Contributor Author

rdb commented Feb 17, 2021

I would kindly like to resubmit this for consideration. Until now, we have been able to build our own fork of manylinux that removes the rm libpython*.a command, but we would like to start offering aarch64 wheels, and it turns out to be prohibitive for us to build our own aarch64 manylinux image (we don't have access to the right hardware, and an attempt to build via qemu ground to a complete halt after 12 hours).

It would really help us a lot to have libpython kept in the Docker image, even if it's moved somewhere totally out of sight, so that we can continue building our executable that embeds the Python interpreter.

@Helveg

Helveg commented Oct 2, 2021

Kind of a counterargument to just including it. I didn't know much about the downsides of building against a shared mode interpreter and with find . -name libpython*.so no location would've been out of sight.

@rdb
Contributor Author

rdb commented Jan 6, 2022

Annual bump. This would be a huge help for us. Our library includes a binary that embeds the interpreter, which requires linking against libpython.a. So far we are using a modified version of manylinux for i686 and x86_64, but it is prohibitively difficult to build our own manylinux for some of the architectures that require cross-compilation.

Is there anything we can do to help move this issue along?

@mayeut
Member

mayeut commented Jan 8, 2022

@rdb,

just as an aside, I think Travis CI once again offers free usage for aarch64, s390x & ppc64le as part of their Partner Queue.

> Is there anything we can do to help move this issue along?

Yes there is. I'll outline what would be acceptable for a PR:

  • to prevent misuse of the libraries, they should not be available by default (i.e. a step is required to make them available, probably through a helper script; other proposals welcome).
  • the size of the image (both compressed & uncompressed) shall not grow too much (let's say 0.5% as a threshold here), as this would impact the 99.5% of users who probably don't need those (if I'm wrong, we can revisit the threshold a bit).
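
The size criterion above is simple arithmetic. A minimal sketch of the check, where the function names and the example image size are assumptions chosen for illustration (the actual image sizes are not stated in this thread):

```python
def growth_fraction(base_bytes: int, added_bytes: int) -> float:
    """Relative growth of an image when added_bytes are added to base_bytes."""
    if base_bytes <= 0:
        raise ValueError("base size must be positive")
    return added_bytes / base_bytes


def within_budget(base_bytes: int, added_bytes: int, threshold: float = 0.005) -> bool:
    """True if the growth stays at or below the threshold (0.5% by default)."""
    return growth_fraction(base_bytes, added_bytes) <= threshold


def max_added_bytes(base_bytes: int, threshold: float = 0.005) -> int:
    """Largest addition, in whole bytes, that still respects the threshold."""
    return int(base_bytes * threshold)


if __name__ == "__main__":
    base = 1_000_000_000  # hypothetical 1 GB image, not a measured value
    print(within_budget(base, 3_000_000))
    print(max_added_bytes(base))
```

The check would need to pass for both the compressed and the uncompressed size, since the criterion names both.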

@mathstuf

mathstuf commented Jan 8, 2022

> to prevent misuse of the libraries, they should not be available by default (i.e. a step is required to make them available, probably through a helper script; other proposals welcome).

Build systems in use in the wild will adapt and seek the library out as needed (e.g., I'd expect CMake's FindPython to gain support at some point). Do you mean "more than a simple -lpython flag" as the limit here or must one do something like extract a password-protected zip to make the library available?

@rdb
Contributor Author

rdb commented Jan 8, 2022

Thanks! I'll get to work on that. No worries, I'll do it in a way that build systems or users won't be able to find it automatically.

From a quick experiment, all the libraries compressed together (so that they share a dictionary) come to less than 3 MB, well under the 0.5% threshold. However, because of Docker's layer system and because the Python versions are all built separately, realising these gains is a little trickier. Hmm.
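
A rough sketch of why compressing the libraries together helps, using synthetic data rather than real libpython.a files. The `fake_libraries` helper and its sizes are made up for illustration, and LZMA is an assumption; the thread does not say which compressor was used in the experiment.

```python
import lzma
import random


def fake_libraries(count: int = 3, shared: int = 200_000,
                   unique: int = 1_000, seed: int = 0) -> list:
    """Build synthetic blobs that share most of their bytes, loosely mimicking
    several builds of the same library. Purely illustrative data."""
    rng = random.Random(seed)
    common = bytes(rng.getrandbits(8) for _ in range(shared))
    return [
        common + bytes(rng.getrandbits(8) for _ in range(unique))
        for _ in range(count)
    ]


def compressed_size(blobs, together: bool) -> int:
    """Total compressed size, either as one stream or one stream per blob."""
    if together:
        # One stream: later blobs can reference bytes seen in earlier ones.
        return len(lzma.compress(b"".join(blobs)))
    # Separate streams: every blob pays for the shared content again.
    return sum(len(lzma.compress(blob)) for blob in blobs)


if __name__ == "__main__":
    libs = fake_libraries()
    print("separate:", compressed_size(libs, together=False))
    print("together:", compressed_size(libs, together=True))
```

The saving only materialises when the files end up in the same compressed stream, which is why files spread over separate layers or archives would not benefit.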

@mayeut
Member

mayeut commented Jan 8, 2022

> From a quick experiment, all the libraries compressed together (so that they share a dictionary) come to less than 3 MB, well under the 0.5% threshold.

Good!

> However, because of Docker's layer system and because the Python versions are all built separately, realising these gains is a little trickier. Hmm.

All CPython versions are copied into the final image in a single step, for various reasons. That means you should be able to achieve what you observed. The individual CPython layers are used for the build cache only, so it's less of an issue for them to grow a bit more. End users will never pull those layers.
The only layer that matters is:

COPY --from=all_python /opt/_internal /opt/_internal/
