Update base
robballantyne committed Feb 8, 2024
1 parent 13a1d14 commit df5e30e
Showing 6 changed files with 106 additions and 126 deletions.
59 changes: 28 additions & 31 deletions .github/workflows/docker-build.yml
@@ -8,9 +8,9 @@ on:
env:
UBUNTU_VERSION: 22.04
BUILDX_NO_DEFAULT_ATTESTATIONS: 1
LATEST_CUDA: "2.1.1-py3.11-cuda-12.1.0-base-22.04"
LATEST_ROCM: "2.1.1-py3.11-rocm-5.6-runtime-22.04"
LATEST_CPU: "2.1.1-py3.11-cpu-22.04"
LATEST_CUDA: "2.2.0-py3.11-cuda-12.1.0-base-22.04"
LATEST_ROCM: "2.2.0-py3.11-rocm-5.6-runtime-22.04"
LATEST_CPU: "2.2.0-py3.11-cpu-22.04"

jobs:
cpu-base:
@@ -21,10 +21,16 @@ jobs:
python:
- "3.10"
- "3.11"
- "3.12"
pytorch:
- "2.0.1"
- "2.1.0"
- "2.1.1"
- "2.1.2"
- "2.2.0"
exclude:
- python: "3.12"
pytorch: "2.1.1"
- python: "3.12"
pytorch: "2.1.2"
steps:
-
name: Free Space
@@ -81,31 +87,23 @@ jobs:
python:
- "3.10"
- "3.11"
- "3.12"
pytorch:
- "2.0.1"
- "2.1.0"
- "2.1.1"
- "2.1.2"
- "2.2.0"
cuda:
- "11.7.1"
- "11.8.0"
- "12.1.0"
level:
- "base"
- "runtime"
- "devel"
- "cudnn8-devel"
exclude:
- pytorch: "2.0.1"
cuda: "12.1.0"
- pytorch: "2.1.0"
cuda: "11.7.1"
- pytorch: "2.1.1"
cuda: "11.7.1"
- cuda: "12.1.0"
level: "devel"
- cuda: "11.8.0"
level: "devel"
- cuda: "11.7.1"
level: "devel"
- python: "3.12"
pytorch: "2.1.1"
- python: "3.12"
pytorch: "2.1.2"
steps:
-
name: Free Space
@@ -161,23 +159,22 @@ jobs:
python:
- "3.10"
- "3.11"
- "3.12"
pytorch:
- "2.0.1"
- "2.1.0"
- "2.1.1"
- "2.1.2"
- "2.2.0"
rocm:
- "5.4.2"
- "5.6"
- "5.7"
level:
- "runtime"
# Templating for future releases
exclude:
- pytorch: "2.0.1"
- pytorch: "2.2.0"
rocm: "5.6"
- pytorch: "2.1.2"
rocm: "5.7"
- python: "3.12"
rocm: "5.6"
- pytorch: "2.1.0"
rocm: "5.4.2"
- pytorch: "2.1.1"
rocm: "5.4.2"
steps:
-
name: Free Space
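The `exclude` entries in the matrices above drop specific combinations from the full cross product. The following is a minimal sketch of that expansion for the `cpu-base` job. It assumes the post-commit lists are Python 3.10/3.11/3.12 and PyTorch 2.1.1/2.1.2/2.2.0 (inferred from the README's supported versions, since this view does not mark added and removed lines). `expand_cpu_matrix` is a hypothetical helper and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand a python x pytorch matrix and skip excluded pairs,
# mirroring how a GitHub Actions matrix applies `exclude`.
expand_cpu_matrix() {
    local pythons=("3.10" "3.11" "3.12")          # assumed post-commit values
    local pytorches=("2.1.1" "2.1.2" "2.2.0")     # assumed post-commit values
    local excluded=("3.12:2.1.1" "3.12:2.1.2")    # python:pytorch pairs from `exclude`
    local py pt ex skip
    for py in "${pythons[@]}"; do
        for pt in "${pytorches[@]}"; do
            skip=false
            for ex in "${excluded[@]}"; do
                if [[ "$ex" == "${py}:${pt}" ]]; then
                    skip=true
                fi
            done
            if [[ "$skip" == false ]]; then
                # Tag shape follows the README example for CPU images.
                printf '%s-py%s-cpu-22.04\n' "$pt" "$py"
            fi
        done
    done
}

expand_cpu_matrix
```

Under these assumptions, Python 3.12 is only built against PyTorch 2.2.0, while 3.10 and 3.11 are built against all three PyTorch versions.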
77 changes: 30 additions & 47 deletions README.md
@@ -34,23 +34,23 @@ Tags follow these patterns:
##### _CUDA_
- `:[pytorch-version]-py[python-version]-cuda-[x.x.x]-base-[ubuntu-version]`

- `:latest-cuda` → `:2.1.1-py3.11-cuda-12.1.0-base-22.04`
- `:latest-cuda` → `:2.2.0-py3.12-cuda-12.1.0-base-22.04`

##### _ROCm_
- `:[pytorch-version]-py[python-version]-rocm-[x.x.x]-runtime-[ubuntu-version]`

- `:latest-rocm` → `:2.1.1-py3.11-rocm-5.6-runtime-22.04`
- `:latest-rocm` → `:2.2.0-py3.12-rocm-5.7-runtime-22.04`

##### _CPU_
- `:[pytorch-version]-py[python-version]-ubuntu-[ubuntu-version]`

- `:latest-cpu` → `:2.1.1-py3.11-cpu-22.04`
- `:latest-cpu` → `:2.2.0-py3.12-cpu-22.04`

Browse [here](https://github.com/ai-dock/jupyter-pytorch/pkgs/container/jupyter-pytorch) for an image suitable for your target environment.

Supported Python versions: `3.11`, `3.10`
Supported Python versions: `3.12`, `3.11`, `3.10`

Supported Pytorch versions: `2.1.1`, `2.1.0` `2.0.1`
Supported Pytorch versions: `2.2.0`, `2.1.2` `2.1.1`

Supported Platforms: `NVIDIA CUDA`, `AMD ROCm`, `CPU`
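
The tag patterns above can be assembled mechanically from their parts. This is a small sketch under stated assumptions: `cuda_tag` and `image_ref` are hypothetical helper names, and the registry path `ghcr.io/ai-dock/jupyter-pytorch` is inferred from the package link above rather than stated in the README.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a CUDA tag following the documented pattern
# [pytorch-version]-py[python-version]-cuda-[x.x.x]-base-[ubuntu-version]
cuda_tag() {
    local pytorch="$1" python="$2" cuda="$3" ubuntu="$4"
    printf '%s-py%s-cuda-%s-base-%s\n' "$pytorch" "$python" "$cuda" "$ubuntu"
}

# Hypothetical helper: prefix a tag with the (assumed) registry path.
image_ref() {
    printf 'ghcr.io/ai-dock/jupyter-pytorch:%s\n' "$1"
}

image_ref "$(cuda_tag 2.2.0 3.12 12.1.0 22.04)"
```

Check the package listing for the tags that actually exist before pulling, since not every combination is built.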

@@ -129,23 +129,21 @@ You can use the included `cloudflared` service to make secure connections withou
| Variable | Description |
| ------------------------ | ----------- |
| `CF_TUNNEL_TOKEN` | Cloudflare zero trust tunnel token - See [documentation](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/). |
| `CF_QUICK_TUNNELS` | Create ephemeral Cloudflare tunnels for web services (default `false`) |
| `CF_QUICK_TUNNELS` | Create ephemeral Cloudflare tunnels for web services (default `true`) |
| `DIRECT_ADDRESS` | IP/hostname for service portal direct links (default `localhost`) |
| `DIRECT_ADDRESS_GET_WAN` | Use the internet facing interface for direct links (default `false`) |
| `GPU_COUNT` | Limit the number of available GPUs |
| `JUPYTER_MODE` | `lab` (default), `notebook` |
| `JUPYTER_PORT` | Set an alternative port (default `8888`) |
| `JUPYTER_TOKEN` | Manually set your password |
| `PROVISIONING_SCRIPT` | URL of a remote script to execute on init. See [note](#provisioning-script). |
| `RCLONE_*` | Rclone configuration - See [rclone documentation](https://rclone.org/docs/#config-file) |
| `SKIP_ACL` | Set `true` to skip modifying workspace ACL |
| `SSH_PORT_LOCAL` | Set a non-standard port for SSH (default `22`) |
| `SSH_PUBKEY` | Your public key for SSH |
| `USER_NAME` | System account username (default `user`)|
| `USER_PASSWORD` | System account password (default `password`)|
| `WEB_ENABLE_AUTH` | Enable password protection for web services (default `true`) |
| `WEB_USER` | Username for web services (default `user`) |
| `WEB_PASSWORD` | Password for web services (default `password`) |
| `WEB_PASSWORD` | Password for web services (default `auto generated`) |
| `WORKSPACE` | A volume path. Defaults to `/workspace/` |
| `WORKSPACE_SYNC` | Move mamba environments and services to workspace if mounted (default `true`) |
| `WORKSPACE_SYNC` | Move mamba environments and services to workspace if mounted (default `false`) |

Environment variables can be specified by using any of the standard methods (`docker-compose.yaml`, `docker run -e...`). Additionally, environment variables can also be passed as parameters of `init.sh`.
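
As an illustration of the `docker run -e ...` method, the sketch below assembles a command line from `NAME=value` pairs and prints it instead of running it. `print_docker_run` is a hypothetical helper, the image reference is an assumption based on the package link in this README, and the variable values are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not execute) a `docker run` command that passes
# each NAME=value argument to the container with -e.
print_docker_run() {
    local image="$1"
    shift
    local args=(docker run)
    local kv
    for kv in "$@"; do
        args+=(-e "$kv")
    done
    args+=("$image")
    printf '%s\n' "${args[*]}"
}

# Placeholder values; variable names come from the table above.
print_docker_run "ghcr.io/ai-dock/jupyter-pytorch:latest-cpu" \
    WEB_USER=alice JUPYTER_MODE=notebook
```

Values containing spaces would need quoting before the printed command is pasted into a shell.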

@@ -155,18 +153,32 @@ Example usage: `docker run -e STANDARD_VAR1="this value" -e STANDARD_VAR2="that

## Security

By default, all exposed web services other than the port redirect page are protected by HTTP basic authentication.
All ai-dock containers are interactive and will not drop root privileges. You should ensure that your docker daemon runs as an unprivileged user.

The default username is `user` and the password is `password`.
### System

A system user will be created at startup. The UID will be either 1000 or will match the UID of the `$WORKSPACE` bind mount.
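
The UID rule above can be pictured with the following sketch. It is not the container's actual startup code, which is not shown in this commit. `resolve_workspace_uid` is a hypothetical name, the use of GNU `stat -c` is an assumption about the environment, and treating a root-owned or missing directory as "use 1000" is an assumption about the fallback.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the UID for the system user.
# Assumption: use the owner of the workspace directory when it exists and is
# not owned by root; otherwise fall back to 1000.
resolve_workspace_uid() {
    local workspace="${1:-/workspace}"
    local uid=""
    if [[ -d "$workspace" ]]; then
        # GNU coreutils stat; prints the numeric owner UID.
        uid=$(stat -c '%u' "$workspace" 2>/dev/null || true)
    fi
    if [[ -z "$uid" || "$uid" -eq 0 ]]; then
        uid=1000
    fi
    printf '%s\n' "$uid"
}

resolve_workspace_uid "${WORKSPACE:-/workspace}"
```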

The user will share the root user's ssh public key.

Some processes may start in the user context for convenience only.

### Web Services

By default, all exposed web services are protected by a single login form at `:1111/login`.

The default username is `user` and the password is auto generated unless you have passed a value in the environment variable `WEB_PASSWORD`. To find the auto-generated password and related tokens you should type `env | grep WEB_` from inside the container.

You can set your credentials by passing environment variables as shown above.

The password is stored as a bcrypt hash. If you prefer not to pass a plain text password to the container you can pre-hash and use the variable `WEB_PASSWORD_HASH`.
If you are running the image locally on a trusted network, you may disable authentication by setting the environment variable `WEB_ENABLE_AUTH=false`.

If you need to connect programmatically to the web services you can authenticate using either `Bearer $WEB_TOKEN` or `Basic $WEB_PASSWORD_B64`.
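
A minimal sketch of the two header forms mentioned above. `auth_header` is a hypothetical helper. The values of `WEB_TOKEN` and `WEB_PASSWORD_B64` are read from the container environment, and their exact contents are not described in this commit. The commented `curl` call assumes Jupyter is reachable on the default port `8888` and uses the Jupyter Server `/api/status` endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an Authorization header for either scheme.
#   auth_header bearer <token>
#   auth_header basic  <base64-credentials>
auth_header() {
    local scheme="$1" value="$2"
    case "$scheme" in
        bearer) printf 'Authorization: Bearer %s\n' "$value" ;;
        basic)  printf 'Authorization: Basic %s\n' "$value" ;;
        *)      printf 'unknown scheme: %s\n' "$scheme" >&2; return 1 ;;
    esac
}

# Example use from a shell where the variables are available (not executed here):
#   curl -H "$(auth_header bearer "$WEB_TOKEN")" http://localhost:8888/api/status
auth_header bearer "example-token"
```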

If you are running the image locally on a trusted network, you may disable authentication by setting the environment variable `WEB_ENABLE_AUTH=false`
The security measures included aim to be as secure as basic authentication, i.e. not secure without HTTPS. Please use the provided cloudflare connections wherever possible.

>[!NOTE]
>You can use `set-web-credentials.sh <username> <password>` change the username and password in a running container.
>You can use `set-web-credentials.sh <username> <password>` to change the username and password in a running container.
## Provisioning script

@@ -179,7 +191,7 @@ The URL must point to a plain text file - GitHub Gists/Pastebin (raw) are suitab
If you are running locally you may instead opt to mount a script at `/opt/ai-dock/bin/provisioning.sh`.

>[!NOTE]
>If configured, `sshd`, `caddy`, `cloudflared`, `rclone`, `jupyter`, `serviceportal`, `storagemonitor` & `logtail` will be launched before provisioning; Any other processes will launch after.
>If configured, `sshd`, `caddy`, `cloudflared`, `jupyter`, `serviceportal`, `storagemonitor` & `logtail` will be launched before provisioning; Any other processes will launch after.
>[!WARNING]
>Only use scripts that you trust and which cannot be changed without your consent.
@@ -233,8 +245,6 @@ As docker containers generally run as the root user, new files created in /works

To ensure that the files remain accessible to the local user that owns the directory, the docker entrypoint will set a default ACL on the directory by executing the command `setfacl -d -m u:${WORKSPACE_UID}:rwx /workspace`.

If you do not want this, you can set the environment variable `SKIP_ACL=true`.

## Running Services

This image will spawn multiple processes upon starting a container because some of our remote environments do not support more than one container per instance.
@@ -316,33 +326,6 @@ See [this guide](https://link.ai-dock.org/guide-sshd-do) by DigitalOcean for an
>[!NOTE]
>_SSHD is included because the end-user should be able to know the version prior to deployment. Using a provider's add-on, if available, does not guarantee this._
### Rclone mount

Rclone allows you to access your cloud storage from within the container by configuring one or more remotes. If you are unfamiliar with the project you can find out more at the [Rclone website](https://rclone.org/).

Any Rclone remotes that you have specified, either through mounting the config directory or via setting environment variables will be mounted at `/workspace/remote/[remote name]`. For this service to start, the following conditions must be met:

- Fuse3 installed in the host operating system
- Kernel module `fuse` loaded in the host
- Host `/etc/passwd` mounted in the container
- Host `/etc/group` mounted in the container
- Host device `/dev/fuse` made available to the container
- Container must run with `cap-add SYS_ADMIN`
- Container must run with `security-opt apparmor:unconfined`
- At least one remote must be configured

The provided docker-compose.yaml includes a working configuration (add your own remotes).

In the event that the conditions listed cannot be met, `rclone` will still be available to use via the CLI - only mounts will be unavailable.

If you intend to use the `rclone create` command to interactively generate remote configurations you should ensure port `53682` is accessible. See https://rclone.org/remote_setup/ for further details.

>[!NOTE]
>_Rclone is included to give the end-user an opportunity to easily transfer files between the instance and their cloud storage provider._
>[!WARNING]
>You should only provide auth tokens in secure cloud environments.
### Logtail

This script follows and prints the log files for each of the above services to stdout. This allows you to follow the progress of all running services through docker's own logging system.
3 changes: 3 additions & 0 deletions build/COPY_ROOT/opt/ai-dock/bin/build/layer0/clean.sh
@@ -3,3 +3,6 @@
# Tidy up and keep image small
apt-get clean -y
micromamba clean -ay

rm /etc/ld.so.cache
ldconfig
16 changes: 12 additions & 4 deletions build/COPY_ROOT/opt/ai-dock/bin/build/layer0/common.sh
@@ -10,17 +10,18 @@ build_common_main() {

build_common_install_jupyter() {
$MAMBA_CREATE -n jupyter python=3.10
printf "/opt/micromamba/envs/jupyter/lib\n" >> /etc/ld.so.conf.d/x86_64-linux-gnu.micromamba.90-jupyter.conf
$MAMBA_INSTALL -n jupyter \
jupyter \
jupyterlab \
nodejs=18
nodejs=20
# This must remain clean. User software should not be in this environment
printf "Removing default ipython kernel...\n"
rm -rf /opt/micromamba/envs/jupyter/share/jupyter/kernels/python3
}

build_common_do_mamba_install() {
$MAMBA_INSTALL -n "$1" -y \
$MAMBA_INSTALL -n "$1" \
ipykernel \
ipywidgets
}
@@ -31,7 +32,7 @@ build_common_do_kernel_install() {
dir="${kernel_path}${3}/"
file="${dir}kernel.json"
cp -rf ${kernel_path}../_template ${dir}

sed -i 's/DISPLAY_NAME/'"$4"'/g' ${file}
sed -i 's/PYTHON_MAMBA_NAME/'"$1"'/g' ${file}
fi
@@ -58,12 +59,19 @@ build_common_install_ipykernel() {
build_common_do_kernel_install "python_310" "3.10"
fi

do_mamba_install "python_311"
build_common_do_mamba_install "python_311"
if [[ $PYTHON_MAMBA_NAME = "python_311" ]]; then
build_common_do_kernel_install "python_311" "3.11" "python3" "Python3 (ipykernel)"
else
build_common_do_kernel_install "python_311" "3.11"
fi

build_common_do_mamba_install "python_312"
if [[ $PYTHON_MAMBA_NAME = "python_312" ]]; then
build_common_do_kernel_install "python_312" "3.12" "python3" "Python3 (ipykernel)"
else
build_common_do_kernel_install "python_312" "3.12"
fi
fi
}
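
The per-version blocks in `build_common_install_ipykernel` repeat the same pattern for `python_310`, `python_311` and now `python_312`. Purely as an illustration, and not something this commit does, the same calls could be driven from a loop. In the sketch below the two `build_common_do_*` functions are stand-in stubs that only print their arguments so the control flow can be followed without micromamba, and `install_all_kernels` is a hypothetical name. The enclosing condition of the real function is not visible in this hunk and is omitted.

```shell
#!/usr/bin/env bash
# Stand-in stubs (assumptions): the real functions install packages and kernels.
build_common_do_mamba_install()  { printf 'install %s\n' "$1"; }
build_common_do_kernel_install() { printf 'kernel %s\n' "$*"; }

# Hypothetical loop form of the repeated per-version blocks.
install_all_kernels() {
    local ver env
    for ver in 3.10 3.11 3.12; do
        env="python_${ver//./}"          # 3.12 -> python_312
        build_common_do_mamba_install "$env"
        if [[ "${PYTHON_MAMBA_NAME:-}" == "$env" ]]; then
            # The image's primary environment also becomes the default kernel.
            build_common_do_kernel_install "$env" "$ver" "python3" "Python3 (ipykernel)"
        else
            build_common_do_kernel_install "$env" "$ver"
        fi
    done
}

PYTHON_MAMBA_NAME=python_312 install_all_kernels
```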

48 changes: 30 additions & 18 deletions build/COPY_ROOT/opt/ai-dock/bin/supervisor-jupyter.sh
@@ -4,13 +4,17 @@ trap cleanup EXIT

function cleanup() {
kill $(jobs -p) > /dev/null 2>&1
fuser -k -SIGTERM ${LISTEN_PORT}/tcp > /dev/null 2>&1 &
rm /run/http_ports/$PROXY_PORT > /dev/null 2>&1
}

function start() {
LISTEN_PORT=${JUPYTER_PORT_LOCAL:-18888}
source /opt/ai-dock/etc/environment.sh

LISTEN_PORT=18888
METRICS_PORT=${JUPYTER_METRICS_PORT:-28888}
PROXY_SECURE=true
QUICKTUNNELS=true

if [[ ! -v JUPYTER_PORT || -z $JUPYTER_PORT ]]; then
JUPYTER_PORT=${JUPYTER_PORT_HOST:-8888}
@@ -61,27 +65,35 @@ function start() {
wait -n
fi

kill $(lsof -t -i:$LISTEN_PORT) > /dev/null 2>&1 &
fuser -k -SIGKILL ${LISTEN_PORT}/tcp > /dev/null 2>&1 &
wait -n

# Allows running in user context when the home directory has non-standard permissions
if [[ $WORKSPACE_MOUNTED == "true" && $WORKSPACE_PERMISSIONS == "false" ]]; then
export JUPYTER_ALLOW_INSECURE_WRITES=true
fi

printf "\nStarting %s...\n" "${SERVICE_NAME:-service}"

exec micromamba run -n jupyter jupyter \
$JUPYTER_MODE \
--allow-root \
--ip=127.0.0.1 \
--port=$LISTEN_PORT \
--no-browser \
--ServerApp.token='' \
--ServerApp.password='' \
--ServerApp.trust_xheaders=True \
--ServerApp.disable_check_xsrf=False \
--ServerApp.allow_remote_access=True \
--ServerApp.allow_origin='*' \
--ServerApp.allow_credentials=True \
--ServerApp.root_dir=$WORKSPACE \
--ServerApp.preferred_dir=$WORKSPACE \
--KernelSpecManager.ensure_native_kernel=False
# Terminado shell_command needs fixing.
# Bash alone invokes neither profile nor bashrc.
micromamba run -n jupyter jupyter \
$JUPYTER_MODE \
--allow-root \
--ip=127.0.0.1 \
--port=$LISTEN_PORT \
--no-browser \
--ServerApp.token='' \
--ServerApp.password='' \
--ServerApp.trust_xheaders=True \
--ServerApp.disable_check_xsrf=False \
--ServerApp.allow_remote_access=True \
--ServerApp.allow_origin='*' \
--ServerApp.allow_credentials=True \
--ServerApp.root_dir=$WORKSPACE \
--ServerApp.preferred_dir=$WORKSPACE \
--ServerApp.terminado_settings="{'shell_command': ['bash','-c','bash']}" \
--KernelSpecManager.ensure_native_kernel=False
}

start 2>&1
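
The external port fallback in `start` can be isolated as follows. The condition and the `${JUPYTER_PORT_HOST:-8888}` default are copied from the script above, while `resolve_jupyter_port` is a hypothetical wrapper added only so the behavior can be exercised on its own. The `-v` test requires bash 4.2 or later.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the fallback used in supervisor-jupyter.sh:
# an unset or empty JUPYTER_PORT falls back to JUPYTER_PORT_HOST, then to 8888.
resolve_jupyter_port() {
    if [[ ! -v JUPYTER_PORT || -z $JUPYTER_PORT ]]; then
        JUPYTER_PORT=${JUPYTER_PORT_HOST:-8888}
    fi
    printf '%s\n' "$JUPYTER_PORT"
}

# Run in a subshell so the caller's environment is left untouched.
( resolve_jupyter_port )
```

An explicitly set `JUPYTER_PORT` takes precedence over `JUPYTER_PORT_HOST`, and an empty string is treated the same as unset.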