From 7fa347a61f8974f57e12828f5216f9abf2f3f4e8 Mon Sep 17 00:00:00 2001
From: "Yu-Hang \"Maxin\" Tang"
Date: Sun, 1 Oct 2023 23:57:07 -0700
Subject: [PATCH] Add FAQ for enroot/pyxis compatibility issue with multi-arch images

---
 README.md | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 322454a9e..cfde95063 100644
--- a/README.md
+++ b/README.md
@@ -161,18 +161,38 @@ The [JAX image](ghcr.io/nvidia/jax) is embedded with the following flags and env
 ## FAQ (Frequently Asked Questions)
+
+ `bus error` when running JAX in a docker container
-Question: A "bus error"
------------------------
+**Solution:**
+```bash
+docker run -it --shm-size=1g ...
+```
-**Q:** When I execute my JAX code, I come across a `bus error`. How can I address this issue?
+**Explanation:**
+The `bus error` might occur due to the size limitation of `/dev/shm`. You can address this by increasing the shared memory size using
+the `--shm-size` option when launching your container.
+
-**A:** The `bus error` might occur due to the size limitation of `/dev/shm`. You can address this by increasing the shared memory size using
-the `--shm-size` option when launching your container. Here is a demonstration of how this can be achieved using Docker:
+
-```bash
-docker run -it --shm-size=1g ...
+enroot/pyxis reports error code 404 when importing multi-arch images
+
+**Problem description:**
+```
+slurmstepd: error: pyxis: [INFO] Authentication succeeded
+slurmstepd: error: pyxis: [INFO] Fetching image manifest list
+slurmstepd: error: pyxis: [INFO] Fetching image manifest
+slurmstepd: error: pyxis: [ERROR] URL https://ghcr.io/v2/nvidia/jax/manifests/ returned error code: 404 Not Found
 ```
+
+**Solution:**
+Upgrade enroot or [apply a single-file patch](https://github.com/NVIDIA/enroot/releases/tag/v3.4.0) as mentioned in the enroot v3.4.0 release note.
+
+**Explanation:**
+Docker has traditionally used Docker Schema V2.2 for multi-arch manifest lists but has switched to using the Open Container Initiative (OCI) format since 20.10. Enroot added support for OCI format in version 3.4.0.
+
+
 ## JAX on Public Clouds
 * AWS