Simplify the use of S3 bucket for work directory #25

Closed
bentsherman opened this issue Jun 26, 2023 · 22 comments
Labels
enhancement New feature or request

Comments

@bentsherman
Contributor

Here is the config I used in order to use S3 as the work directory:

workDir = "$HOME/nextflow-ci-dev/work"

process.executor = 'float'

float {
    address = '...'
    username = '...'
    password = '...'
    nfs = '[mode=rw]s3://nextflow-ci-dev/work'
}

I also had to mount s3://nextflow-ci-dev to my launch environment.

It would be nice to simplify the S3 usage in two ways:

  • If the user specifies the workDir as an S3 path, set the nfs option automatically. That way they don't have to set it twice or think about how things are mounted or wonder why the setting is called nfs when they are using it for S3 😄

  • Leverage Nextflow's ability to transparently operate on S3 paths through the Path API. In other words, your executor should be able to interact with the work directory without it being mounted locally. That way, the user doesn't need to mount an S3 bucket, they can just run their pipeline.

I initially tried to do this with workDir = 's3://nextflow-ci-dev/work', but it failed with an UnsupportedOperationException. No error trace unfortunately, but it is somewhere in GridTaskHandler::submit(). This exception likely means that the executor tried to do a file operation that isn't supported by S3.
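As a rough illustration of the first suggestion, the sketch below derives the nfs value from workDir whenever workDir is an S3 path. The function name derive_nfs_option is hypothetical and this is not the plugin's code; the [mode=rw] prefix is simply copied from the config shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-float): derive the `nfs` option from workDir.
derive_nfs_option() {
    local work_dir="$1"
    case "$work_dir" in
        s3://?*)
            # Reuse the "[mode=rw]" prefix from the config in this issue.
            printf '[mode=rw]%s\n' "$work_dir"
            ;;
        *)
            # Not an S3 path: leave the nfs option for the user to set.
            return 1
            ;;
    esac
}

derive_nfs_option "s3://nextflow-ci-dev/work"
```

With something like this in place, the user would only need to set workDir once, and a local path would leave the nfs setting untouched.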

@jealous
Contributor

jealous commented Jul 6, 2023

These are very good suggestions. At least we could do item one. But we will also try item two. Thanks!

jealous added the enhancement label on Jul 6, 2023
@bentsherman
Contributor Author

For number two, I would be happy to help you out since it might be tricky. But it would be a huge win for you guys in terms of attracting Nextflow users, who are quite spoiled and used to not having to mount their S3 bucket. Maybe I can review your code and try to outline the changes you would need to make?

@jealous
Contributor

jealous commented Jul 7, 2023

That would be very helpful. I will let you know if we run into any problems.

@bentsherman
Contributor Author

Since you have a REST API as you said in #24, probably the most straightforward approach would be to model the float executor after the AWS Batch executor in Nextflow rather than the grid executor, using the REST API directly instead of the CLI. The grid executor isn't really the right abstraction for float, because float is running in the cloud and using object storage as the work directory.

If that's too much work, you might be able to get away with customizing the grid task handler for float.

@bentsherman
Contributor Author

Cedric, I'm going to try a few things today to see if I can get the S3 paths to work.

I see that the float CLI can accept an S3 path for the job script. Can it accept the job script via stdin?

@bentsherman
Contributor Author

Did you change the build system at all? I'm trying to build the plugin following the README, but when I run it, I get the following error:

$ ./gradlew compileGroovy

BUILD SUCCESSFUL in 3s
21 actionable tasks: 1 executed, 20 up-to-date

$ ./launch.sh run hello -plugins nf-float
Missing '.launch.classpath' file -- create it by running: ./gradlew exportClasspath

$ ./gradlew exportClasspath

FAILURE: Build failed with an exception.

Task 'exportClasspath' not found in root project 'nf-float'.

@bentsherman
Contributor Author

Never mind the build issue, I got it to work. I will submit a PR with some general improvements separately.

Let me know if the float CLI can accept stdin, but we should be able to make it work either way.

@bentsherman
Contributor Author

Cedric, sorry to spam you with comments. Although I was able to build locally, I'm getting this error when I try to run a pipeline:

$ ./launch.sh run hello -plugins nf-float
N E X T F L O W  ~  version 23.06.0-edge
Launching `https://github.com/nextflow-io/hello` [determined_wright] DSL2 - revision: 1d71f857bb [master]
ERROR ~ Unknown executor name: float

I am just following the README instructions. Please let me know if there is anything else I need to do to make it work.

@jealous
Contributor

jealous commented Jul 19, 2023

Looks like the plugin is not loaded. You may need to check .nextflow.log to see if nf-float is loaded.
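A quick way to do that check is to search the log for the plugin name. This is only a sketch: plugin_in_log is a hypothetical helper, and it makes no assumption about the exact wording Nextflow uses in its log messages.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every log line that mentions the plugin (with line numbers).
# Returns non-zero when nothing matches. The log file defaults to .nextflow.log
# in the current directory.
plugin_in_log() {
    local plugin="$1"
    local log="${2:-.nextflow.log}"
    grep -n -i -- "$plugin" "$log"
}

# Usage (run from the directory where the pipeline was launched):
#   plugin_in_log nf-float
```

If there are no matching lines, the plugin was probably never picked up. If there are matches but the executor is still unknown, the problem is more likely in how the executor is registered.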

@jealous
Contributor

jealous commented Jul 19, 2023

Unfortunately, the float command does not support reading scripts from stdin. The workaround I can think of is to download the script and put it in a temp directory.
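A minimal sketch of that workaround, assuming the AWS CLI is installed and has credentials for the bucket. The helper name stage_job_script is hypothetical, and the float invocation is left as a comment because its remaining arguments depend on the job.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a job script from S3 into a fresh temp directory
# and print the local path on stdout.
stage_job_script() {
    local s3_uri="$1"
    local tmp_dir
    tmp_dir="$(mktemp -d)" || return 1
    # `aws s3 cp <src> <dst>` copies a single object; its progress output is
    # sent to stderr so that stdout carries only the local path.
    aws s3 cp "$s3_uri" "$tmp_dir/.command.run" >&2 || return 1
    printf '%s\n' "$tmp_dir/.command.run"
}

# Usage (not run here):
#   job="$(stage_job_script s3://<bucket>/<task-dir>/.command.run)"
#   float [...] sbatch [...] --job "$job"
```

The temp directory is not cleaned up here; a real implementation would want to remove it once the job has been submitted.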

@bentsherman
Contributor Author

The nf-float plugin is being loaded, but it's not registering the executor. That's all I can tell for now. I will keep digging, maybe there is a bug in Nextflow. Which version of Nextflow are you building against?

As for the job script, it says the float CLI accepts an S3 path so we can just do that.

@jealous
Contributor

jealous commented Jul 19, 2023

The source code I am building against on my dev machine is 23.02.1-edge.
The version I am testing with is 23.04.2 build 5870.

@jealous
Contributor

jealous commented Jul 19, 2023

Is it possible for you to create a PR in the repo? Maybe I could take a look and debug the code.

@bentsherman
Contributor Author

No code changes yet, this is just the base plugin. Here is a summary of what I'm doing:

  • clone nf-float plugin
  • clone nextflow to ../nextflow at revision v23.02.1-edge
  • add includeBuild('../nextflow') to settings.gradle
  • build plugin with ./gradlew compileGroovy
  • test plugin: ./launch.sh run hello -plugins nf-float

At this point I get this error:

Missing '.launch.classpath' file -- create it by running: ./gradlew exportClasspath

This was the first issue I asked about. It looks like the Makefile is missing a build step, compared to nf-hello:

compile:
	./gradlew :nextflow:exportClasspath compileGroovy
	@echo "DONE `date`"

So I run ./gradlew :nextflow:exportClasspath to generate the classpath file.

When I run the plugin again, I get the missing executor error:

 ./launch.sh run hello -plugins nf-float
N E X T F L O W  ~  version 23.02.1-edge
Launching `https://github.com/nextflow-io/hello` [desperate_noyce] DSL2 - revision: 1d71f857bb [master]
Unknown executor name: float

So, how are you building and running this plugin differently from what I have done?
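For anyone reproducing this, the source-build steps above can be collected into one script. This is only a sketch of the sequence described in this comment: it assumes it is run from the nf-float checkout, with Nextflow already cloned to ../nextflow at v23.02.1-edge and includeBuild('../nextflow') added to settings.gradle. The run and build_and_launch helpers are hypothetical names.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Generate the classpath file, compile the plugin, then launch the hello pipeline.
build_and_launch() {
    run ./gradlew :nextflow:exportClasspath compileGroovy &&
    run ./launch.sh run hello -plugins nf-float
}

# Usage (not run here):
#   DRY_RUN=1 build_and_launch   # print the commands without executing them
#   build_and_launch             # execute them
```

This reproduces the setup that ends in the "Unknown executor name: float" error; it does not fix it.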

@jealous
Contributor

jealous commented Jul 19, 2023

Oh, I build the zip file with ./gradlew makeZip and then deploy it to the $NXF_HOME/plugins folder.
I also create a config file that includes the float scope. Then I can run it with nextflow run <script> -c float.conf
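The deploy step could look roughly like the sketch below. It rests on several assumptions: deploy_plugin_zip is a hypothetical helper, the location of the zip produced by makeZip is not stated in this thread and is passed in as an argument, the zip is assumed to unpack into a directory named after the file (for example nf-float-0.2.0), and NXF_HOME is assumed to fall back to $HOME/.nextflow when unset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a plugin zip into $NXF_HOME/plugins/<zip-name>/.
deploy_plugin_zip() {
    local zip="$1"
    local plugins_dir="${NXF_HOME:-$HOME/.nextflow}/plugins"
    local name
    # Directory name = zip file name without the .zip suffix (an assumption).
    name="$(basename "$zip" .zip)"
    mkdir -p "$plugins_dir/$name" &&
    unzip -q -o "$zip" -d "$plugins_dir/$name"
}

# Usage (not run here):
#   ./gradlew makeZip
#   deploy_plugin_zip <path-to-generated-zip>
#   nextflow run <script> -c float.conf
```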

@bentsherman
Contributor Author

Okay, I see how that might be more reliable. I'm not sure why the source build isn't working, could be a bug in Nextflow. I will look into it when I have time.

Thanks Cedric. I think that should be enough to get me into testing. I'm going to make a PR with some general suggestions around configuration, and I will look into the improved S3 support.

@jealous
Contributor

jealous commented Jul 19, 2023

Thanks, Ben. I will also try to work on this issue over the next few days. If I understand correctly, Nextflow supports an S3 work directory natively, and I just need to make sure the work directory is mounted at the same path on the worker nodes.
Maybe I also need to use a utility in Nextflow to download the script from S3 and pass it to the float command line.

@bentsherman
Contributor Author

I will go ahead and push a PR with what I have so far.

The float CLI says that it can accept an S3 path for the job script:

-j, --job string                  job file to execute. Support local file, S3 file, OSS file and https|http file

But when I try that, I get this error:

$ float [...] sbatch --dataVolume s3://nf-ireland/work --image quay.io/nextflow/bash --cpu 1 --mem 4 --job s3://nf-ireland/work/3b/ddc65e1ca1ea8967ab197b5337a10b/.command.run --customTag nf-job-id:6mzjkv-2
Error: Invalid argument, please provide a valid s3 url (code: 1003)

Is there some custom syntax required for the S3 path?

@bentsherman
Contributor Author

Can you add me as a collaborator so that I can push a branch? Or would you prefer that I create a fork?

@jealous
Contributor

jealous commented Jul 20, 2023

I don't have the privilege to add collaborators. Please feel free to create a fork.

@bentsherman
Contributor Author

Okay, I made a PR. Let me know what you think about the float CLI error.

jealous added a commit that referenced this issue Jul 22, 2023
Allow user to set s3 work directory.  We will create an option
to tell MMC to mount the s3 bucket on the worker nodes.
jealous added a commit that referenced this issue Jul 22, 2023
Allow user to set s3 work directory.  We will tell float to mount the
s3fs when the work directory is in s3.

After this change, the Nextflow client doesn't need direct access to
the work directory any more.

Remove the support of `float` related options in the `process` scope.

Bump the version to 0.2.0.
jealous added a commit that referenced this issue Jul 23, 2023
Allow user to set s3 work directory.  We will tell float to mount the
s3fs when the work directory is in s3.

After this change, the Nextflow client doesn't need direct access to
the work directory any more.

Remove the support of global `float` options in the `process` scope.
Remove the default task option in `float` scope.  Please specify them
in the `process` scope.

Bump the version to 0.2.0.
jealous added a commit that referenced this issue Jul 25, 2023
Retrieve all input and analyze the mount point.  If the work
directory is s3, we need to mount all s3 buckets in the input.

Address review comments.  Copy the code to retrieve s3 credentials
from NextFlow's `Global` class because it will be updated in recent
releases.
jealous added a commit that referenced this issue Jul 25, 2023
Use Nextflow's default registry and container image if the image
is not specified.  Otherwise raise error.
@bentsherman
Contributor Author

Fixed by #44
