Update build images to use something other than manylinux2014 #122

Closed
flavorjones opened this issue Jun 3, 2024 · 13 comments

Comments

@flavorjones
Collaborator

flavorjones commented Jun 3, 2024

Context

The x86-linux-gnu and x86_64-linux-gnu build images are based on manylinux2014. manylinux is a set of base images maintained by the Python community for building Wheels (which is what precompiled Python packages are called), and you can read more about the project at https://github.com/pypa/manylinux

manylinux2014 is specified at https://peps.python.org/pep-0599/. It is based on CentOS 7 and supports GLIBC_2.17 and above.

CentOS 7 becomes EOL on 2024-06-30 according to https://www.redhat.com/en/topics/linux/centos-linux-eol
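
The glibc floors quoted above follow from the policy names themselves. As a minimal sketch of the naming scheme described in PEP 600 (the function name `glibc_floor` and the alias table layout are illustrative and not part of rake-compiler-dock), the mapping could be written in Python like this:

```python
import re

# Legacy policy names and the glibc versions they alias (per PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}

# PEP 600 style names encode the glibc version directly: manylinux_<major>_<minor>
_PEP600_NAME = re.compile(r"^manylinux_(\d+)_(\d+)$")


def glibc_floor(policy):
    """Return the minimum (major, minor) glibc version for a manylinux policy name."""
    if policy in LEGACY_ALIASES:
        return LEGACY_ALIASES[policy]
    match = _PEP600_NAME.match(policy)
    if match is None:
        raise ValueError(f"not a manylinux policy name: {policy!r}")
    return int(match.group(1)), int(match.group(2))
```

Under this scheme `manylinux2014` and `manylinux_2_28` differ only in the glibc version they promise, which is the trade-off discussed in the options below.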

Problems we're trying to solve

  • I'd like to use a more modern base image. Older versions of packages can cause problems; see for example issue #121 ("pkg-config version in x86-linux and x86_64-linux images taking a long time to resolve re2.pc"), which describes the performance issues of pkg-config in these images.
  • I'd like to unify the linux builds a bit.
    • Currently the x86-linux-gnu and x86_64-linux-gnu images are distinct from the other gnu linux images, which use ubuntu:20.04 and only support GLIBC_2.29 and above.
    • And the musl images are built using ubuntu:20.04 and a cross-compilation stack which supports unclear versions of the musl libc API.

Option A: Move to manylinux_2_28 and musllinux_1_2?

Work in progress here: #124

We'll leave the windows and darwin builds to happen via cross-compilation on a vanilla ubuntu system for now. We've got until 2025-04-23 to get off of ubuntu:20.04.

Problems doing this would introduce:

  • It means we would support GLIBC_2.28 and above for all versions of gnu linux: an improvement for arm and aarch64 (currently 2.29 and above), but a worse situation for x86 (currently 2.17 and above).
    • Though, to be fair, I don't know that any older versions of GLIBC receive production support from distro vendors anymore? So maybe this is OK.
  • It doesn't look like manylinux_2_28 supports 32-bit linux anymore, see the contents of https://quay.io/organization/pypa, so we would likely have to drop 32-bit support
    • Though Nokogiri's download stats put x86-linux at 0.02% of downloads since 2021-01, so this might be a non-issue?

Option B: Move to ubuntu:20.04 for everything?

Work in progress here: #126

  • manylinux_2_28 works for x86_64
    • but does not provide support for 32-bit linux
    • and non-x86 images are pinned to that platform so they don't easily cross-compile
  • ubuntu 20.04 seems to support building against glibc 2.29
    • not much worse than 2.28
    • much lower complexity since we unify the image build process

Let's give it a try.

@sxlijin

sxlijin commented Jul 23, 2024

Hi - this is now causing an active breakage for me, as a user of rb-sys-dock (over in Rust-FFI land). rb-sys-dock is based on rake-compiler-dock, and now that CentOS 7 is EOL, its default repository mirror list no longer resolves.

For me, this manifested as an inability to build OpenSSL dependencies: I need to yum install perl-IPC-Cmd to make my x86_64-linux builds work, and that fails with this error:

Loaded plugins: fastestmirror, ovl
Determining fastest mirrors
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container error was
14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"

gh workflow logs: https://github.com/BoundaryML/baml/actions/runs/10066707009/job/27828665640

centos mirrorlist deprecation: https://serverfault.com/questions/1161816/mirrorlist-centos-org-no-longer-resolve

@flavorjones
Collaborator Author

@sxlijin It's nice to meet you, I'm glad to hear rake-compiler-dock is useful for you and the rb-sys-dock project.

I'm sorry you're getting bit by CentOS being EOLed. This issue is not my highest priority at the moment, though my plan is to work on it later this year. What are your thoughts on the pros and cons I listed in the original post? Are you able to help out with this migration at all?

@flavorjones
Collaborator Author

I've updated the OP to include the discovery that modern manylinux doesn't provide support for x86-linux (32-bit linux).

flavorjones added a commit that referenced this issue Jul 27, 2024
As part of #122 and adopting the pypa wheel images, we also must
embrace their support matrix which does not include 32-bit linux as of
manylinux_2_28. See https://github.com/pypa/manylinux for more information.
@flavorjones
Collaborator Author

Work in progress at #124

@sxlijin

sxlijin commented Jul 29, 2024

Appreciate the quick response!

For anyone coming here through Google: I'm currently working around this by patching the CentOS default sources (see PR).
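
For readers hitting the same mirrorlist error, here is a rough sketch of what patching the CentOS default sources can look like. It shows the general technique (disable the `mirrorlist=` lookup and re-enable `baseurl=` pointing at `vault.centos.org`, where EOL CentOS releases are archived) and is an assumption, not necessarily what the PR mentioned above does. `patch_centos_repo` is a hypothetical helper name.

```python
import re


def patch_centos_repo(text):
    """Rewrite the text of a CentOS 7 yum .repo file to avoid the retired mirrorlist host."""
    patched = []
    for line in text.splitlines():
        if line.startswith("mirrorlist="):
            # mirrorlist.centos.org no longer resolves, so disable the lookup.
            line = "#" + line
        elif re.match(r"#\s*baseurl=", line):
            # Re-enable the commented-out baseurl and point it at the archive host.
            line = re.sub(r"^#\s*", "", line)
            line = line.replace("mirror.centos.org", "vault.centos.org")
        patched.append(line)
    return "\n".join(patched) + "\n"
```

In an image build, a function like this would be applied to each file under /etc/yum.repos.d/ before running yum install.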

Re the task at hand and the pros/cons:

  • 32-bit Linux support is irrelevant for us
  • I'm not sure about the minimum ABI version I would suggest targeting; our Python x86-64 build is actually currently using manylinux2014 (it's probably sufficient for us to simply be using ubuntu-20.04 instead, but manylinux: auto was working out of the box, so we didn't tweak the Python settings we were using)

Our Ruby userbase is also small, and tends to be in more greenfield spaces, so I think compatibility going back to ubuntu-20.04 is what matters to us, and older than that isn't a priority.

Also worth noting: arm64-linux users tend to be on newer platforms (imo largely because arm64-linux wasn't a thing until AWS started rolling out Graviton machines, i.e. arm64 Linux machines which are cheaper and more available than their x64 equivalents).

Sorry if this is a bit rambly: am trying to give you a useful answer between compile cycles. (And unfortunately, I won't really be able to devote any cycles to helping with this.)

@flavorjones
Collaborator Author

"our Python x86-64 build is actually currently using manylinux2014"

Please note that the reason you commented here is that this project uses manylinux2014, which is based on CentOS 7: https://github.com/pypa/manylinux?tab=readme-ov-file#manylinux2014-centos-7-based-glibc-217 ... your response is confusing to me.

I've got a spike moving to manylinux_2_28 and although it works fine for x86_64-linux-gnu it doesn't solve some other problems, and I'm wondering about moving back to Ubuntu for everything. I'll spend some more cycles on it.

flavorjones changed the title from "Update build images to use manylinux_2_28 and musllinux_1_2" to "Update build images to use something other than manylinux2014" on Jul 29, 2024
@flavorjones
Collaborator Author

Updated OP to note that I'm also exploring unifying everything onto ubuntu:20.04.

@sxlijin

sxlijin commented Jul 30, 2024

"our Python x86-64 build is actually currently using manylinux2014"

"Please note that the reason you commented here is that this project uses manylinux2014, which is based on CentOS 7:"

Yes, sorry, I do realize that, and I also agree that it's... uh, confusing on my side.

For some more context: our codebase is Rust, and we ship SDKs using FFI frameworks for Python, TS, and Ruby. Our Python builds rely on a 3rd-party-maintained GitHub action, which under the hood is selecting manylinux2014 for our x64 Linux build, but that build has always Just Worked™ out of the box.

Our Ruby build, by contrast, goes through the rb-sys-dock Docker image (which wraps rake-compiler-dock and adds Rust dependencies), and for some reason, our x64 Linux build for Ruby did not just work. That discrepancy is what led me down the rabbit hole to your project.

@sxlijin

sxlijin commented Jul 30, 2024

Aha, I wonder if this is it: the manylinux maintainers have patched the CentOS mirrorlist so that yum install on manylinux2014 should Just Work™: pypa/manylinux#1628.

So perhaps my (transitive) dependency on rake-compiler-dock just needs a version bump somewhere in the middle.

@flavorjones
Collaborator Author

@sxlijin Oh, interesting, then I may just be able to bump the base image and cut a new release. Let me take a look.

@flavorjones
Collaborator Author

See #127

@flavorjones
Collaborator Author

@sxlijin OK, you should be good to go if you use the images from https://github.com/rake-compiler/rake-compiler-dock/releases/tag/v1.5.2

@flavorjones
Collaborator Author

Closed by #126. If anybody wants to make a case for supporting a version of glibc older than 2.29, please open a new issue! Thanks.
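
To check whether a given host would be affected by the 2.29 floor, something like the following sketch may help. It relies on Python's standard `platform.libc_ver()`. The helper names and the `MIN_GLIBC` constant are illustrative, and the floor value is simply the one stated in this thread.

```python
import platform

# Assumption: the minimum glibc version stated in this thread for the new images.
MIN_GLIBC = (2, 29)


def parse_version(text):
    """Turn a version string such as '2.31' into a comparable (major, minor) tuple."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def host_meets_floor(libc_name, libc_version, floor=MIN_GLIBC):
    """Return True when the host reports glibc at or above the given floor."""
    if libc_name != "glibc":
        # musl and non-Linux hosts are outside the scope of this check.
        return False
    return parse_version(libc_version) >= floor


if __name__ == "__main__":
    name, version = platform.libc_ver()
    print(name or "unknown libc", version, host_meets_floor(name, version))
```

A host that reports glibc 2.17, as CentOS 7 does, would fail this check and is the kind of case worth raising in a new issue.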
