Nightly Release of Torch-MLIR python wheels for linux_x86_64 builds #1
- name: Make assets available in dist
  run: |
    mkdir dist
    cp build_tools/python_deploy/wheelhouse/torch*.whl dist/
This includes both torch and torch_mlir wheels, sticking to the convention of previous snapshot releases.
minor fix; py_version fix; test hendrik/ccache, and split py3.10 and py3.11 into separate jobs; include torch wheels too, add release body message; remove ccache entirely; try GITHUB_TOKEN; elevate permissions for GITHUB_TOKEN; trial; fix; fix; fix; fixes
Thanks. As discussed on Discord, let's land and iterate on these. It can be tricky to bring things up from scratch. I'll do a detailed review of the whole.
The first set of wheels are up: https://github.com/llvm/torch-mlir-release/releases 🎉
…r dynamo exported models (#37)

This PR includes support for calling into the TorchDynamo export + FX import APIs in torch-mlir to generate `Torch` dialect, followed by conversions to TCP dialect. This is bundled together with a fully functional hermetic python sandbox in TCP bazel, with access to torch-mlir and core mlir's python APIs and ability to run python lit/integration style tests.

This depended on a bunch of foundational work:

1. Add a hermetic python interpreter to `mlir-tcp` bazel: Needed for mlir/torch-mlir pybind integrations, python lit+integration tests.
2. Register python bindings for torch-mlir/core mlir: Upstreaming torch-mlir's bazel python build had several issues (as it's heavily dependent on Cruise's internal bazel workspace), so I decided to instead install from pre-built python wheels (for torch and torch-mlir). This allows keeping the setup in TCP lean, and leverage/reuse torch-mlir's CMake python build for any python integrations.
3. Automate releases of torch-mlir python binaries (llvm/torch-mlir-release#1): This was [discontinued](https://discourse.llvm.org/t/rfc-discontinuing-pytorch-1-binary-releases/76371) as a part of the CI revamp.
4. Support for running python lit tests: Updated lit configs with python interpreter integration.
5. Add `fx_importer/basic_test.py` (as a python lit test) and mlir lit tests for TorchToTcp conversions.

Other minor workflow updates:

- Use `--test_output=errors` for better log readability in CI failures
- Add setup-build/action.yml to consolidate workspace setup
- Include disk space cleanup actions (due to OOM issues with default GHA runners)
- Bump actions versions

P.S. Our stablehlo build fails at the moment due to disk space issues (unrelated to this change, verified it passes locally). This is just an outcome of using default GitHub hosted runners for CI with very limited disk space resources. Adding some cleanup steps helped with llvm build but still fails stablehlo. We might want to look into self-hosted runners with larger memory/compute going forward. For now the non-TCP workflows are not merge-gating (optional) so this is not a blocker to this PR landing.
Reuses the build_linux_packages.sh script to build the project and generate python wheels. However, I chose not to port the original GitHub Actions workflows for snapshot releases, which seemed to do a lot more than what I initially wanted. Here's roughly what's different in this trimmed-down version:
I used a standalone fork, which was useful for the initial bringup: it took a good 30-ish iterations on the `main` branch to get the GitHub workflow right, plus some playing with auth tokens.
Here's how the release page looks:
(It follows the same naming convention as existing wheels, but updates the same `dev-wheels` release in place to avoid changing the expand_assets link required for pip-style hosting.)
Here's the workflow:
@stellaraccident and I decided to avoid using Personal Access Tokens (PATs) here for release permissions, and instead to elevate the GITHUB_TOKEN permissions for the `contents` scope (required for an action to make a release). We think this is reasonable considering that write access to this repo is limited to a small number of contributors, and that the elevation is scope-specific rather than write-all.
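For readers unfamiliar with scoped GITHUB_TOKEN permissions, here is a minimal sketch of what the elevation described above could look like in a workflow file. This is not the workflow from this PR: the job name, the step name, and the use of `gh release upload` are assumptions made for illustration, and the sketch assumes an earlier step has already placed the wheels in `dist/` and that the `dev-wheels` release already exists.

```yaml
# Hypothetical job, for illustration only (not the workflow from this PR).
jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      # Grant write access to the `contents` scope only, which is what
      # creating or updating a release needs. Scopes not listed here
      # are set to `none` once a `permissions` block is present.
      contents: write
    steps:
      # Assumes earlier steps built the wheels and copied them into dist/.
      - name: Upload wheels to the existing dev-wheels release
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # --clobber replaces any existing asset with the same name,
          # so the release is updated in place.
          gh release upload dev-wheels dist/*.whl --clobber --repo "$GITHUB_REPOSITORY"
```

Declaring `permissions` at the job level keeps the elevated scope limited to the one job that publishes the release, rather than applying it to every job in the workflow.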