Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative to apiDockerCall (and Singularity) for Class Base Jobs #2064

Closed
juanesarango opened this issue Feb 14, 2018 · 4 comments
Closed

Comments

@juanesarango
Copy link
Contributor

juanesarango commented Feb 14, 2018

This is an issue to share some of the work @jsmedmar and I have done. Maybe someone will find it useful as well.

We are toil enthusiasts, and we needed to integrate both docker and singularity support to run containerized system calls within our Class Based Jobs (CBJs).

We realized that apiDockerCall didn't work for CBJs due to job deferring issues. As such we developed a python package called toil_container 🐳 with a Toil Job Class capable of containerized system calls. We didn't submit a PR because we would like to first present it here to gather feedback.

Any comments would be useful and any contribution will be welcome.

EDIT: Check its features and usage in the updated README.md of the toil_container repo.

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-226

@jsmedmar
Copy link
Contributor

This relates to #2058

@DailyDreaming
Copy link
Member

First of all, this is cool and the enthusiasm is appreciated! Singularity support would be great. A PR was attempted for just singularity specifically at #1805 (which was a copy of the subprocessDockerCall function, but with a few lines changed for singularity), but this seems potentially better.

Second, there are also issues specific to this that you may not have run into before. For example:

I noticed you used: self.docker_client = docker.from_env() which will break unless you either use docker.from_env(version='auto') or assiduously update your docker versions so that the main install and the pip version are always in sync (not guaranteed on most user computers either).

Another is the main reason toil vg still uses subprocessDockerCall, in that they pipe stdout/stderr and need to be able to do this for large file streams without it breaking.

Also, I'm not sure about shadowing the function built-in names ( def check_output and def check_call ). These should have their own names.

I really like that it checks the docker and/or singularity environment and the associated errors.

Other lines need looking at and review (things like /dev/null not being closed). I would welcome a PR and I really love the --help augmentation. Thank you for doing this!

jsmedmar added a commit to papaemmelab/toil_container that referenced this issue Feb 27, 2018
This version was motivated by feedback from DataBiosphere/toil#2064 (comment):

* use subprocess32
* use `docker.from_env(version='auto')`
* fix the way `workdir` was mapped to `/tmp` on singularity containers
* avoid shadowing of  subprocess names `def check_output` and `def check_call`
* avoid use of `/dev/null`
@jsmedmar
Copy link
Contributor

@DailyDreaming we addressed your feedback in release 0.2.0, check it out.

We need to finish some stuff at work and then we’ll create two PRs. One for the short parser and the other for the container calls, maybe something like this:

toil/src/lib/parsers.py
toil/src/lib/containers.py

Or maybe:

toil/src/lib/parsers/*
toil/src/lib/containers/*

Let us know what you think. Here are some highlights on toil_container changes:

  • use subprocess32
  • use docker.from_env(version='auto')
  • fix the way workdir was mapped to /tmp on singularity containers
  • avoid shadowing of subprocess names def check_output and def check_call
  • avoid use of /dev/null

@adamnovak
Copy link
Member

We implemented a similar thing in toil-vg for dispatching things to Singularity, Docker, or normal commands as appropriate based on some configuration: https://github.com/vgteam/toil-vg/blob/d4c12287f1362ac3f7d723895c885c555039f8c9/src/toil_vg/vg_common.py#L113

Yours looks pretty polished. It would be great to merge it upstream into Toil! But I'm going to close this issue, because it's just tracking looking at your thing, which is done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants