[DRAFT] Make the pax ARM dockerfile work on AMD64 #280

Closed
24 changes: 22 additions & 2 deletions .github/container/Dockerfile.pax.arm64
@@ -1,6 +1,10 @@
# syntax=docker/dockerfile:1-labs
###############################################################################
## Pax for AArch64
## Pax for AArch64 and Amd64 for GraceHopper.
## We want both containers to be equivalent.
## GH needs special treatment, as not all pip wheels support it.
## So this is more complex than what x86 needs.
## Over time, the GH installation should become simpler.
###############################################################################

ARG BASE_IMAGE=ghcr.io/nvidia/jax:latest
@@ -20,9 +24,22 @@ RUN apt-get update && \
apt-get autoremove -y && apt-get clean && rm -rf /var/lib/apt/lists


RUN wget https://github.com/bazelbuild/bazelisk/releases/download/v1.17.0/bazelisk-linux-arm64 -O /usr/bin/bazel && \
RUN wget https://github.com/bazelbuild/bazelisk/releases/download/v1.17.0/bazelisk-linux-$(dpkg --print-architecture) -O /usr/bin/bazel && \
chmod a+x /usr/bin/bazel

# force a recent tensorflow_datasets version to have latest protobuf dep
RUN pip install tensorflow_datasets==4.9.2 auditwheel tensorflow==2.13.0

## Install tensorflow-text
RUN cd ${INSTALL_DIR} && \
git clone http://github.com/tensorflow/text.git && \
cd text && \
git checkout v2.13.0 && \
./oss_scripts/run_build.sh && \
find * | grep '.whl$' && \
pip install ./tensorflow_text-*.whl && \
cd .. && \
rm -Rf text

# Lingvo
ADD install_lingvo_aarch64.sh /opt/
@@ -36,6 +53,9 @@ ENV NVTE_FRAMEWORK=jax
ADD install-te.sh /usr/local/bin
RUN install-te.sh

ADD install-flax.sh /usr/local/bin
RUN install-flax.sh

# Install T5 now, Pip will build the wheel from source, it needs Rust.
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs > /tmp/rustup.sh && \
echo "be3535b3033ff5e0ecc4d589a35d3656f681332f860c5fd6684859970165ddcc /tmp/rustup.sh" | sha256sum --check && \
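The Dockerfile change above picks the bazelisk binary with `$(dpkg --print-architecture)`, which prints `amd64` or `arm64` on Debian-based images and so matches the release asset names used in the diff. As a rough sketch only, the same selection could be written as a small helper that also accepts the names `uname -m` reports, for images where `dpkg` is not available. The function names `bazelisk_arch` and `bazelisk_url` are hypothetical and are not part of this PR.

```shell
# Hypothetical helper, not part of the PR: map a machine name to the
# architecture suffix used by the bazelisk release assets in the diff.
bazelisk_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the download URL in the same form as the diff.
# $1 is the bazelisk version tag, $2 is the architecture suffix.
bazelisk_url() {
  echo "https://github.com/bazelbuild/bazelisk/releases/download/$1/bazelisk-linux-$2"
}
```

A call such as `bazelisk_url v1.17.0 "$(bazelisk_arch "$(uname -m)")"` would then produce the URL passed to `wget`. Failing loudly on an unknown architecture is a design choice here, so that the image build stops instead of downloading a file that does not exist.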
26 changes: 8 additions & 18 deletions .github/container/install_lingvo_aarch64.sh
@@ -4,38 +4,28 @@ INSTALL_DIR="${INSTALL_DIR:-/opt}"
LINGVO_REF="${LINGVO_REF:-HEAD}"
LINGVO_REPO="${LINGVO_REPO:-https://github.com/tensorflow/lingvo.git}"

## Install tensorflow-text
cd ${INSTALL_DIR}
pip install tensorflow_datasets==4.9.2 # force a recent version to have latest protobuf dep
pip install auditwheel
pip install tensorflow==2.13.0
git clone http://github.com/tensorflow/text.git
pushd text
git checkout v2.13.0
./oss_scripts/run_build.sh
find * | grep '.whl$'
pip install ./tensorflow_text-*.whl
popd
rm -Rf text

## Install lingvo
## Download lingvo early to fail fast
LINGVO_INSTALLED_DIR=${INSTALL_DIR}/lingvo

[[ -d lingvo ]] || git clone ${LINGVO_REPO} ${LINGVO_INSTALLED_DIR}

pushd ${LINGVO_INSTALLED_DIR}
# Local patches, two PR waiting to be merged + one custom patch
git fetch origin pull/326/head:pr326
# git fetch origin pull/326/head:pr326 ## merged upstream
# git fetch origin pull/328/head:pr328 ## merged upstream
git fetch origin pull/329/head:pr329
git config user.name "JAX Toolbox"
git config user.email "jax@toolbox"
# git cherry-pick pr326 pr328 pr329 ## pr328 merged
git cherry-pick pr326 pr329
# git cherry-pick --allow-empty pr326 pr328 pr329 ## pr326 pr328 merged
git cherry-pick --allow-empty pr329

# Disable 2 flaky tests here
patch -p1 < /opt/lingvo.patch
popd


## Install lingvo
pushd ${LINGVO_INSTALLED_DIR}
sed -i 's/tensorflow=/#tensorflow=/' docker/dev.requirements.txt
sed -i 's/tensorflow-text=/#tensorflow-text=/' docker/dev.requirements.txt
sed -i 's/dataclasses=/#dataclasses=/' docker/dev.requirements.txt
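The `sed` lines above comment out the `tensorflow`, `tensorflow-text`, and `dataclasses` pins so that installing lingvo's requirements does not override the versions already installed in the image. The sketch below shows the effect on a made-up requirements file. The file contents and version numbers are assumptions for illustration and may not match the real `docker/dev.requirements.txt`. It also anchors each pattern with `^`, which the script in this PR does not do, and it relies on GNU `sed -i` as found in the Linux container.

```shell
# Hypothetical sample file; the real docker/dev.requirements.txt may differ.
req_file="$(mktemp)"
cat > "$req_file" <<'EOF'
numpy==1.24.3
tensorflow==2.13.0
tensorflow-text==2.13.0
dataclasses==0.6
EOF

# Comment out each pin in place. The leading ^ (an addition in this sketch)
# keeps the pattern from matching inside longer package names.
for pkg in tensorflow tensorflow-text dataclasses; do
  sed -i "s/^${pkg}=/#${pkg}=/" "$req_file"
done

# Show the result: the three pins are commented out, numpy is untouched.
cat "$req_file"
```

Because `tensorflow=` requires an `=` straight after the package name, the first substitution leaves the `tensorflow-text` line alone, and that line is handled by its own pattern.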