#17 implement deployment scripts with docker and ansible
trifonov-vl committed Dec 11, 2019
1 parent 83c8ca9 commit d40980f
Showing 14 changed files with 293 additions and 3 deletions.
7 changes: 7 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
.idea
*.pyc
__pycache__
resources
out
logs
.git
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@
__pycache__
resources
out
logs
57 changes: 56 additions & 1 deletion README.md
@@ -1007,4 +1007,59 @@ Presence of `entities` and `relations` keys is controlled via respective query p
```
1. Build the image with `docker build . -t derek-container-name`.
1. Run the container with `docker run -p 80:5000 -d derek-container-name`. This command binds the host's port 80 to the container's port 5000; change the mapping if you wish.
1. You can now send requests to the server on port 80.

#### How to run experiments on remote host using Ansible and Docker

We provide Ansible scripts and Dockerfiles for deploying experiments on a remote or local host.

Expected experiment pipeline:
1. Prepare the `<src_directory>/resources` folder with the resources the experiment requires on your local machine.
1. Upload code and resources to the host machine.
1. Build the Docker image and run the experiment container on the host machine.
1. Wait until the experiment is over.
1. Fetch the results to the local machine.

To run the scripts, the host must support Docker and your local machine must have Ansible installed. The host user must also be in sudoers (required to run Docker).
To use GPUs for experiments, the host platform must be [supported by NVIDIA](https://nvidia.github.io/nvidia-docker/) and have a driver that supports CUDA 10 (>= 410.48) installed.

First, prepare an [Ansible inventory file](https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html) that describes your hosts; every script needs it.
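
As a minimal sketch, a one-host INI-style inventory could be created as shown here. The group name `experiment_hosts`, the alias `gpu-box`, the address, and the user name are hypothetical placeholders; `ansible_host` and `ansible_user` are standard Ansible inventory variables. The provided scripts target `hosts: all`, so the group name itself does not matter.

```bash
# Write a one-host inventory file; every name and address below is a placeholder.
inventory_path="${TMPDIR:-/tmp}/derek-inventory.ini"
cat > "${inventory_path}" <<'EOF'
[experiment_hosts]
gpu-box ansible_host=203.0.113.10 ansible_user=researcher
EOF
echo "Inventory written to ${inventory_path}"
```

Pass the resulting path to `ansible-playbook` with the `-i` option.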

All scripts are run from the src directory using the following template (the `-e` part in braces is optional):
```bash
ansible-playbook -i <path-to-inventory> ansible/<script> {-e 'option1=1 option2=2'}
```

If Docker is not installed on the host:
1. Run `install_docker.yml`.
1. (Only for GPU support) Run `install_nvidia_extensions.yml`.

The resources required by the experiment (embedding models, `props.json`, `lst.json`, segmentation models, datasets, etc.) must be placed in the `resources` directory inside the src directory on your local machine.
We recommend providing `resources/run.sh` with the bash commands that start the experiment, for example preparing the dataset and running `param_search.py`. Note that it will run from the src directory with `PYTHONPATH` already set. Don't forget to set the executable flag locally (`chmod +x resources/run.sh`).
Your experiment script must work with the `resources`, `out` and `logs` directories, because the provided scripts mount them as volumes onto the same directories in the container (the default shell scripts such as `holdout.sh` and `cross-validation.sh` already do this).
Source code and resources will be uploaded to the host machine.
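
A `resources/run.sh` could look roughly like the sketch below. The `param_search.py` flags are the ones used by `tools/holdout.sh`; the task name, seed value, and dataset paths are hypothetical placeholders, and the `props.json` file name is assumed from the resource list above.

```bash
# Create resources/run.sh; it is meant to be run from the src directory inside the container.
mkdir -p resources
cat > resources/run.sh <<'EOF'
#!/bin/bash
# Hypothetical experiment: task name, seed value and dataset paths are placeholders.
mkdir -p out logs
python3 -u tools/param_search.py -task my_task \
    -props resources/props.json -lst resources/lst.json \
    -seeds 1 -out out \
    holdout -train resources/train -dev resources/dev \
    2> logs/err.log | tee logs/out.log
EOF
# Set the executable flag, as recommended above.
chmod +x resources/run.sh
```

Writing to `out` and `logs` keeps the results and logs in the mounted volumes, so they remain on the host after the container stops.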

Now you are ready to start the experiment:
1. Upload code and resources with `upload_code_resources.yml`. You can choose the directory in which to store the files with the Ansible option `src_destination` (default: `~/derek`).
1. Build and run the experiment container with `build_run_container.yml`.
   If you specified a directory in the previous step, you must specify it here too.
   You also need to set the `command_to_run` option to the bash command to run.
   If you prepared `resources/run.sh`, pass `command_to_run=". resources/run.sh"`.
   To use GPUs, pass `use_gpu=true`. You can specify [NVIDIA Docker options](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(Native-GPU-Support)#usage) with `gpu_options` (default: `all`).

Your final command will look like this:
```bash
ansible-playbook -i <path-to-inventory> ansible/build_run_container.yml -e 'command_to_run=". resources/run.sh" use_gpu=true src_destination=~/derek-gpu'
```

You can now log in to the host and check whether the container is running with `sudo docker ps`.
You can view the command's logs with `sudo docker logs <container_name>`. This is useful when something has gone wrong and the container is not running.
You can also inspect the volume directories (`resources`, `logs`, `out`).
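
If you prefer not to check by hand, a small polling helper can wait on the host until the experiment container has stopped. This is only a sketch: `wait_for_image` and the `DOCKER` variable are hypothetical names, and it relies on `docker ps -q --filter ancestor=<image>` listing the running containers started from an image (`derek-cpu` or `derek-gpu`, as tagged by `build_run_container.yml`).

```bash
# Command used to talk to Docker; override it (e.g. DOCKER=docker) if sudo is not needed.
DOCKER="${DOCKER:-sudo docker}"

# Block until no running container was started from the given image.
# Usage: wait_for_image <image> [poll-interval-seconds]
wait_for_image() {
    image="$1"
    interval="${2:-60}"
    # `docker ps -q` prints only the IDs of running containers; empty output means none are left.
    while [ -n "$(${DOCKER} ps -q --filter "ancestor=${image}")" ]; do
        sleep "${interval}"
    done
    echo "No running containers for image ${image}"
}
```

For example, `wait_for_image derek-gpu 300` would poll every five minutes and return once the GPU experiment container has exited.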

When the container has stopped and the experiment is over, you can download the contents of the host's `out` directory with `fetch_results.yml`.
You can set the `fetch_directory_name` option to choose the local directory name (default: `out`).
You must specify `src_destination` if you set it in the previous steps.

To start a new experiment, you have to remove the `resources` and `out` directories on the host. You can do so with the `remove_resources.yml` and `remove_results.yml` scripts, respectively.
`src_destination` must be set to the same value as in the previous steps.
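
Putting the steps together, the sequence of invocations for one GPU experiment might look like the following dry-run sketch, which only prints the commands. The inventory path `hosts.ini`, the destination `~/derek-gpu`, and the helper name `print_pipeline` are illustrative assumptions; the playbook names and options are the ones described above.

```bash
# Dry-run sketch: print the playbook invocations for one experiment, in order.
inventory="hosts.ini"        # hypothetical inventory path
destination="~/derek-gpu"    # value passed as src_destination (kept literal; expanded on the host)

print_pipeline() {
    # 1. Upload code and resources.
    printf '%s\n' "ansible-playbook -i ${inventory} ansible/upload_code_resources.yml -e 'src_destination=${destination}'"
    # 2. Build the image and start the experiment container.
    printf '%s\n' "ansible-playbook -i ${inventory} ansible/build_run_container.yml -e 'command_to_run=\". resources/run.sh\" use_gpu=true src_destination=${destination}'"
    # 3. After the container has stopped, fetch the results.
    printf '%s\n' "ansible-playbook -i ${inventory} ansible/fetch_results.yml -e 'src_destination=${destination} fetch_directory_name=out'"
    # 4. Clean up before the next experiment.
    printf '%s\n' "ansible-playbook -i ${inventory} ansible/remove_resources.yml -e 'src_destination=${destination}'"
    printf '%s\n' "ansible-playbook -i ${inventory} ansible/remove_results.yml -e 'src_destination=${destination}'"
}

print_pipeline
```

Run the printed commands one at a time from the src directory, waiting for the experiment to finish before fetching results.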

38 changes: 38 additions & 0 deletions ansible/build_run_container.yml
@@ -0,0 +1,38 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  vars:
    src_destination: ~/derek
    use_gpu: "false"
    gpu_options: "all"
    command_to_run:
    postfix: "{{ 'gpu' if (use_gpu == 'true') else 'cpu' }}"
  tasks:
    - name: Get absolute path
      command: echo "{{ src_destination }}"
      register: abs_path
    - name: Set absolute path
      set_fact:
        src_abs_destination: "{{ abs_path.stdout }}"
    - name: Build container
      command: docker build . -f "dockerfiles/Dockerfile.{{ postfix }}" -t "derek-{{ postfix }}"
      become: true
      args:
        chdir: "{{ src_abs_destination }}"
    - name: Run cpu container
      command: >
        docker run -d -v "{{ src_abs_destination }}/resources":/derek/resources
        -v "{{ src_abs_destination }}/out":/derek/out -v "{{ src_abs_destination }}/logs":/derek/logs
        derek-cpu /bin/bash -c "cd /derek; {{ command_to_run }}"
      become: true
      args:
        chdir: "{{ src_abs_destination }}"
      when: use_gpu != 'true'
    - name: Run gpu container
      command: >
        docker run --gpus "{{ gpu_options }}" -d -v "{{ src_abs_destination }}/resources":/derek/resources
        -v "{{ src_abs_destination }}/out":/derek/out -v "{{ src_abs_destination }}/logs":/derek/logs
        derek-gpu /bin/bash -c "cd /derek; {{ command_to_run }}"
      become: true
      args:
        chdir: "{{ src_abs_destination }}"
      when: use_gpu == 'true'
11 changes: 11 additions & 0 deletions ansible/fetch_results.yml
@@ -0,0 +1,11 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  vars:
    src_destination: ~/derek
    fetch_directory_name: out
  tasks:
    - name: Fetch results
      synchronize:
        src: "{{ src_destination }}/out/"
        dest: "../{{ fetch_directory_name }}"
        mode: pull
21 changes: 21 additions & 0 deletions ansible/install_docker.yml
@@ -0,0 +1,21 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  become: true
  tasks:
    - name: Add Docker GPG key
      apt_key: url=https://download.docker.com/linux/ubuntu/gpg

    - name: Install basic list of packages
      apt:
        name: ['apt-transport-https','ca-certificates','curl','gnupg2','software-properties-common']
        state: present
        update_cache: yes

    - name: Add Docker APT repository
      apt_repository:
        repo: deb [arch=amd64] https://download.docker.com/linux/{{ansible_distribution|lower}} {{ansible_distribution_release}} stable

    - name: Install Docker packages
      apt:
        name: ['docker-ce','docker-ce-cli','containerd.io']
        state: present
23 changes: 23 additions & 0 deletions ansible/install_nvidia_extensions.yml
@@ -0,0 +1,23 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  become: true
  tasks:
    - name: Add nvidia-docker GPG key
      apt_key: url=https://nvidia.github.io/nvidia-docker/gpgkey

    - name: Get nvidia-docker list
      get_url:
        url: "https://nvidia.github.io/nvidia-docker/{{ hostvars[inventory_hostname].ansible_distribution|lower}}\
          {{ hostvars[inventory_hostname].ansible_distribution_version }}/nvidia-docker.list"
        dest: "/etc/apt/sources.list.d/nvidia-docker.list"

    - name: Install nvidia-container-toolkit
      apt:
        name: ['nvidia-container-toolkit']
        state: present
        update_cache: yes

    - name: Restart docker
      systemd:
        name: docker
        state: restarted
17 changes: 17 additions & 0 deletions ansible/remove_resources.yml
@@ -0,0 +1,17 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  vars:
    src_destination: ~/derek
  tasks:
    - name: Get absolute path
      command: echo "{{ src_destination }}"
      register: abs_path
    - name: Set absolute path
      set_fact:
        src_abs_destination: "{{ abs_path.stdout }}"

    - name: Delete resources directory
      file:
        path: "{{ src_abs_destination }}/resources/"
        state: absent
      become: true
17 changes: 17 additions & 0 deletions ansible/remove_results.yml
@@ -0,0 +1,17 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  vars:
    src_destination: ~/derek
  tasks:
    - name: Get absolute path
      command: echo "{{ src_destination }}"
      register: abs_path
    - name: Set absolute path
      set_fact:
        src_abs_destination: "{{ abs_path.stdout }}"

    - name: Delete out directory
      file:
        path: "{{ src_abs_destination }}/out/"
        state: absent
      become: true
45 changes: 45 additions & 0 deletions ansible/upload_code_resources.yml
@@ -0,0 +1,45 @@
#!/usr/bin/env ansible-playbook
- hosts: all
  vars:
    src_destination: ~/derek
  tasks:
    - name: Create directory
      file:
        path: "{{ src_destination }}"
        state: directory

    - name: Upload main srcs
      synchronize:
        src: ../derek/
        dest: "{{ src_destination }}/derek"
        delete: yes

    - name: Upload tools srcs
      synchronize:
        src: ../tools/
        dest: "{{ src_destination }}/tools"
        delete: yes

    - name: Upload digger srcs
      synchronize:
        src: ../babylondigger/
        dest: "{{ src_destination }}/babylondigger"
        delete: yes

    - name: Upload dockerfiles
      synchronize:
        src: ../dockerfiles/
        dest: "{{ src_destination }}/dockerfiles"
        delete: yes

    - name: Upload resources
      synchronize:
        src: ../resources/
        dest: "{{ src_destination }}/resources"
        delete: yes

    - name: Upload requirements.txt
      synchronize:
        src: ../requirements.txt
        dest: "{{ src_destination }}/requirements.txt"
        delete: yes
24 changes: 24 additions & 0 deletions dockerfiles/Dockerfile.cpu
@@ -0,0 +1,24 @@
FROM ubuntu:18.04
MAINTAINER Trifonov Vladislav <trifonov@ispras.ru>

RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y locales \
&& sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen \
&& dpkg-reconfigure --frontend=noninteractive locales \
&& update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN apt-get update && apt-get install -qq -y python3 python3-pip
ENV PYTHONPATH $PYTHONPATH:/derek:/derek/babylondigger:/derek/babylondigger/tdozat-parser-v3
WORKDIR /derek
COPY requirements.txt .
RUN pip3 install -r ./requirements.txt
COPY tools/requirements.txt ./tools/
RUN pip3 install -r ./tools/requirements.txt
RUN python3 -c "import nltk;nltk.download('punkt')"

COPY babylondigger ./babylondigger
COPY tools ./tools
COPY derek ./derek
25 changes: 25 additions & 0 deletions dockerfiles/Dockerfile.gpu
@@ -0,0 +1,25 @@
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
MAINTAINER Trifonov Vladislav <trifonov@ispras.ru>

RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y locales \
&& sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen \
&& dpkg-reconfigure --frontend=noninteractive locales \
&& update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN apt-get update && apt-get install -qq -y python3 python3-pip
ENV PYTHONPATH $PYTHONPATH:/derek:/derek/babylondigger:/derek/babylondigger/tdozat-parser-v3
WORKDIR /derek
COPY requirements.txt .
RUN sed -i "s/tensorflow/tensorflow-gpu/g" requirements.txt
RUN pip3 install -r ./requirements.txt
COPY tools/requirements.txt ./tools/
RUN pip3 install -r ./tools/requirements.txt
RUN python3 -c "import nltk;nltk.download('punkt')"

COPY babylondigger ./babylondigger
COPY tools ./tools
COPY derek ./derek
5 changes: 4 additions & 1 deletion tools/cross-validation.sh
@@ -20,9 +20,12 @@ then
unlabeled="-unlabeled $5"
fi

logs_path=logs
mkdir -p ${logs_path}

current_time=$(date +%Y-%m-%d--%H-%M-%S)

python3 -u tools/param_search.py -task $3 -props resources/prop.json -lst resources/lst.json \
-seeds $4 -out ${out_path} ${unlabeled} \
cross_validation -traindev ${input_path} -folds $2 \
-2>> err--${current_time}.log | tee out--${current_time}.log
+2>${logs_path}/err--${current_time}.log | tee ${logs_path}/out--${current_time}.log
5 changes: 4 additions & 1 deletion tools/holdout.sh
@@ -33,9 +33,12 @@ then
unlabeled="-unlabeled $5"
fi

logs_path=logs
mkdir -p ${logs_path}

current_time=$(date +%Y-%m-%d--%H-%M-%S)

python3 -u tools/param_search.py -task $3 -props resources/prop.json -lst resources/lst.json \
-seeds $4 -out ${out_path} ${unlabeled} \
holdout -train ${train_path} -dev ${dev_path} \
-2>> err--${current_time}.log | tee out--${current_time}.log
+2>${logs_path}/err--${current_time}.log | tee ${logs_path}/out--${current_time}.log
